


Vol. XXXVI Parts III and IV 


BIOMETRIKA 


FOUNDED BY 


W. F. R. WELDON, FRANCIS GALTON AND KARL PEARSON 


MANAGING EDITOR 


E. S. PEARSON 


ASSOCIATE EDITORS 
M. G. KENDALL JOHN WISHART 


in consultation with 


HARALD CRAMER R. C. GEARY 
J. B. S. HALDANE 


Reprinted by offset-litho, 1960 


ISSUED BY 
THE BIOMETRIKA OFFICE, UNIVERSITY COLLEGE, LONDON 


PRINTED AT THE UNIVERSITY PRESS, CAMBRIDGE 


[Issued 23 January 1950] 






















VOLUME XXXVI, PARTS III AND IV DECEMBER 1949 


THE ESTIMATION OF THE PARAMETERS OF 
TOLERANCE DISTRIBUTIONS 


By D. J. FINNEY 
Lecturer in the Design and Analysis of Scientific Experiment, University of Oxford 


1. INTRODUCTION 


In many types of investigation concerning the distribution of tolerances in a population, 
direct measurement of tolerance for each individual in a sample is impossible; the only 
practicable procedure may be to subject individuals to specified ‘doses’ of the stimulus 
under test and to record whether or not they show a particular response. For example, in 
the evaluation of the toxic properties of insecticides, a different dose of an insecticide may 
be given to each of several batches of insects and the numbers of deaths and survivals 
recorded. Insects killed by a certain dose are known to have a tolerance lower than that dose, 
insects surviving are known to have a higher tolerance, but only the one test can be made 
on each insect and the parameters of the distribution of tolerance can therefore be estimated 
only by comparison of the proportions of responses recorded for different doses. Other 
examples of data of this kind have been given elsewhere (Finney, 1947a). 

A method of analysis based upon the probit transformation is now commonly used to 
assist the interpretation of quantal response data of the kind just described. The writer has 
published (1947a) a detailed account of the standard probit technique for estimating the 
parameters of a normal distribution of tolerance. The purpose of the present paper is to 
derive and illustrate a general method, applicable to tolerance distributions other than the 
normal, which includes various modifications in the specification of the problem such as 
those already discussed by Finney (1944, 1947b) and Wadley (1949). 


2. THE EQUIVALENT DEVIATE TRANSFORMATION 


Suppose that each member of a population has a tolerance, $u$, in respect of some stimulus, 
and that the distribution of $u$ depends only upon a parameter of location, $\mu$, and a parameter 
of scale, $\sigma$. The distribution may then be written 

$$\frac{1}{\sigma}\, f\!\left(\frac{u-\mu}{\sigma}\right) du, \tag{1}$$

where $f(v)$ is a function containing no unknown parameters and 

$$\int_{-\infty}^{\infty} f(v)\,dv = 1. \tag{2}$$


Either or both of the tails of the distribution may have finite limits, but for convenience 
these cases may be regarded as contained within the infinite limits without affecting the 
general theory; the limits of the distribution, of course, are assumed to be independent of 
the parameters. The probability that a stimulus whose measure on the same scale as $u$ is 
$x$ will produce the characteristic response in an individual chosen at random is 

$$P = \int_{-\infty}^{x} \frac{1}{\sigma}\, f\!\left(\frac{u-\mu}{\sigma}\right) du. \tag{3}$$




For example, in a population of insects whose tolerance of a certain insecticide is distributed 
normally, the probability that a dose x would kill a particular insect is 


$$P = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(u-\mu)^2}{2\sigma^2}\right\} du.$$


The units for $u$ and $x$ need not be those in which the experimenter habitually measures, and 
often a dose metameter, such as the logarithm of the concentration of insecticide, is required 
in order that the distribution shall take the right form. 








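Write now 

$$v = \frac{u-\mu}{\sigma} = \alpha + \beta u, \text{ say}, \tag{4}$$

where 

$$\alpha = -\mu/\sigma, \qquad \beta = 1/\sigma. \tag{5}$$

Then the general expression for $P$ may be written 

$$P = \int_{-\infty}^{\alpha+\beta x} f(v)\,dv. \tag{6}$$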


For any specified function f(v), a quantity Y, the equivalent deviate of P, may be defined as 
a monotonic increasing function of P by the equation 


$$P = \int_{-\infty}^{Y} f(v)\,dv. \tag{7}$$

In particular, for a normal distribution of tolerances 

$$f(v) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}v^2},$$

and Y is the normal equivalent deviate of P, differing from the probit of P only by omission 
of the conventional addition of 5. Equations (6) and (7) then show that the relationship 


between the probability of response and the measure of the stimulus, which is equivalent 
to a statement of the tolerance distribution, may be expressed by 


$$Y = \alpha + \beta x. \tag{8}$$

Consequently, the procedure for estimating the parameters $\mu$ and $\sigma$ may be regarded as 
finding a linear relationship between $x$ and the $Y$-transform of the probability of response; 
this probability is itself estimated from experiments by $p$, the proportion of individuals 
responding to a particular dose, and equations (5) enable estimates of $\mu$ and $\sigma$ to be obtained 
from $\alpha$ and $\beta$. When the proportion responding to a particular dose can be directly observed, 
in an experiment in which a known number of individuals are given the dose and those 
responding to it are counted, the sequence of calculations for solving the maximum likelihood 
estimation equations is the analogue of the standard probit calculations corresponding to 
the form of tolerance distribution adopted. The theory of this, with examples of its consequences 
for several tolerance distributions, has been given elsewhere (Finney, 1947a, b). 
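
The transformation of equations (5)-(8) can be illustrated for the normal case by a minimal sketch. It assumes Python's standard `statistics.NormalDist` for the normal integral and its inverse; the function names and the numerical values of $\mu$ and $\sigma$ are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # f(v) is taken to be the standard normal density


def response_probability(x, mu, sigma):
    """Equation (6) for a normal tolerance distribution: P = Phi(alpha + beta*x)."""
    alpha, beta = -mu / sigma, 1.0 / sigma  # equation (5)
    return STD_NORMAL.cdf(alpha + beta * x)


def equivalent_deviate(p):
    """Equation (7) inverted: the Y whose lower-tail integral of f equals P."""
    return STD_NORMAL.inv_cdf(p)


# Hypothetical parameters on a log-dose scale (not taken from the paper).
mu, sigma = 1.2, 0.4
for x in (0.8, 1.2, 1.6):
    p = response_probability(x, mu, sigma)
    y = equivalent_deviate(p)
    # Y is linear in x (equation (8)); the probit is Y + 5.
    print(f"x={x:.1f}  P={p:.4f}  Y={y:+.3f}  probit={y + 5:.3f}")
```

Plotting `Y` against `x` for such values gives the straight line of equation (8), with slope $1/\sigma$ and intercept $-\mu/\sigma$.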


3. THE ESTIMATION OF THE PARAMETERS 


Experimentation leading to direct observation of values of p for given values of x is not always 
possible. For example, the population may contain an unknown proportion of individuals 
who will respond whatever dose is given, or the total number of individuals receiving a 
particular dose may have to be estimated from a parallel sample instead of being counted. 
Methods of analysis for these two problems have been described (Finney, 1944; Wadley, 







1949), and their similarity suggests that they are particular instances of some more general 
theorem. Such a theorem will now be derived. 

Suppose that the tolerance distribution for individuals is that given above as (1), so that 
the probability of response at dose x due to the action of that dose is P (= 1— Q, say). Suppose 
further that an experiment is performed in which r, the number of individuals responding, 
can be observed for a series of values of the dose, $x$, and that, under the conditions of experiment, 
the probability of exactly $r$ responses at a particular $x$ is 

$$P(r) = F(J + KQ); \tag{9}$$

here $J$ and $K$ are additional parameters which in general will also require estimation, and 
$F(\;)$ is a known function. For example, if a proportion $C_1$ of the population will respond 
even to zero dose and a proportion $C_2$ is unable to respond however large the dose, then the 
probability that an individual selected at random will respond when it receives a dose $x$ is 

$$P^* = (1 - C_2) - Q(1 - C_1 - C_2). \tag{10}$$


If n individuals are subjected to a dose x, the number of responses will have a binomial 
distribution based upon this probability: 


$$P(r) = \binom{n}{r} P^{*r} (1 - P^*)^{n-r}, \tag{11}$$

which is of the form of equation (9). Unless $C_1$ and $C_2$ are known, direct estimation of $P$ from 
tests at a particular $x$ is impossible, and the whole data must be used in order to estimate also 
the additional parameters. If $C_2$ is known to be zero, for a normal distribution of tolerances 
this reduces to the problem of taking account of ‘natural mortality’ in probit analysis 
(Finney, 1944, 1947a, Chapter 6). 
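
Equations (10) and (11) translate directly into code. The following is a minimal sketch using only the standard library; the function names are hypothetical, and `q` stands for $Q = 1 - P$ at the dose in question.

```python
from math import comb


def total_response_probability(q, c1=0.0, c2=0.0):
    """Equation (10): P* = (1 - C2) - Q(1 - C1 - C2)."""
    return (1.0 - c2) - q * (1.0 - c1 - c2)


def prob_r_responses(r, n, q, c1=0.0, c2=0.0):
    """Equation (11): binomial probability of r responders among n subjects."""
    p_star = total_response_probability(q, c1, c2)
    return comb(n, r) * p_star**r * (1.0 - p_star) ** (n - r)


# With Q = 1 (zero dose) only the proportion C1 responds;
# with Q = 0 (maximal dose) all but the proportion C2 respond.
print(total_response_probability(1.0, c1=0.17))
print(total_response_probability(0.0, c2=0.10))
```

In the notation of equation (9), this corresponds to $J = 1 - C_2$ and $K = -(1 - C_1 - C_2)$.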

The logarithm of the likelihood of the results obtained for a series of doses may be written 

$$L = S \log P(r) = S \log F, \tag{12}$$

where the summation denoted by $S$ may include doses for which $x \to -\infty$, for which $Q$ is 
known to be unity, or $x \to +\infty$, for which $Q$ is known to be zero. If both $J$ and $K$ are unknown, 
the maximum likelihood estimates of $\alpha$, $\beta$, $J$, $K$ will be obtained by equating to zero the 
partial differential coefficients of $L$ with respect to each of these; if, as sometimes happens, 
the conditions of the problem specify $J$ or $K$ exactly, the corresponding equation will be 
omitted. 

Now 

$$\frac{\partial Q}{\partial \alpha} = -f(v) = -Z, \text{ say, and } \frac{\partial Q}{\partial \beta} = -xZ, \tag{13}$$

where Z is the ordinate of the standardized tolerance distribution, f(v)dv. Consequently 
the maximum likelihood equations may be written 


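$$\left.\begin{gathered}
\frac{\partial L}{\partial \alpha} = -K\,S\,\frac{ZF'}{F} = 0, \qquad \frac{\partial L}{\partial J} = S\,\frac{F'}{F} = 0,\\
\frac{\partial L}{\partial \beta} = -K\,S\,\frac{xZF'}{F} = 0, \qquad \frac{\partial L}{\partial K} = S\,\frac{QF'}{F} = 0,
\end{gathered}\right\} \tag{14}$$

where 

$$F'(A) = \frac{\partial}{\partial A} F(A). \tag{15}$$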


Equations (14) can be solved most easily by an iterative process which calculates adjustments 
to provisional values of $\alpha$, $\beta$, $J$, $K$. The adjustments may be calculated from Taylor-Maclaurin 
expansions to the first order, in the usual manner (cf. Finney, 1947a, Appendix 1, 
1947b). If $J$, $K$ are approximations to the estimates of these parameters obtained at the end 
of one cycle of calculations, an empirical rate of response to one of the doses tested may 
be defined as $p$ $(= 1 - q)$, the maximum likelihood estimate of $P$ from the data for that dose 
alone when $J$ and $K$ have these values.† Thus $q$ is the solution of 








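$$F'(J + Kq) = 0. \tag{16}$$

Hence, to the first order in $(Q - q)$, 

$$F'(J + KQ) = K(Q - q)\,F''(J + KQ). \tag{17}$$

The second differential coefficients of $L$, after substitution of the expected values of all 
parameters, are 

$$\left.\begin{gathered}
\frac{\partial^2 L}{\partial \alpha^2} = K^2 S\,\frac{Z^2F''}{F}, \qquad \frac{\partial^2 L}{\partial J^2} = S\,\frac{F''}{F},\\
\frac{\partial^2 L}{\partial \alpha\,\partial \beta} = K^2 S\,\frac{xZ^2F''}{F}, \qquad \frac{\partial^2 L}{\partial \alpha\,\partial K} = -K\,S\,\frac{QZF''}{F},\\
\frac{\partial^2 L}{\partial \beta^2} = K^2 S\,\frac{x^2Z^2F''}{F}, \qquad \frac{\partial^2 L}{\partial \beta\,\partial K} = -K\,S\,\frac{xQZF''}{F},\\
\frac{\partial^2 L}{\partial \alpha\,\partial J} = -K\,S\,\frac{ZF''}{F}, \qquad \frac{\partial^2 L}{\partial J\,\partial K} = S\,\frac{QF''}{F},\\
\frac{\partial^2 L}{\partial \beta\,\partial J} = -K\,S\,\frac{xZF''}{F}, \qquad \frac{\partial^2 L}{\partial K^2} = S\,\frac{Q^2F''}{F}.
\end{gathered}\right\} \tag{18}$$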
Introduce now a weight, $W$, defined by 

$$W = -\frac{K^2 Z^2 F''}{F}, \tag{19}$$

and two auxiliary variates, $t_1$, $t_2$, defined by 

$$t_1 = -\frac{1}{KZ}, \qquad t_2 = -\frac{Q}{KZ}. \tag{20}$$


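If $\alpha$, $\beta$, $J$, $K$ are approximations to the maximum likelihood estimates, obtained by a 
graphical method, by rough calculation, or from a previous cycle of the calculations about 
to be described, the quantities defined by equations (19) and (20) may be formed for each 
dose, with the aid of equations (6), (9) and (13). Adjustments to the estimates, $\delta\alpha$, $\delta\beta$, $\delta J$, $\delta K$, 
are then the solution of a set of linear equations whose coefficients are given by equations 
(18); these equations may be written: 

$$\left.\begin{gathered}
\delta\alpha\,SW + \delta\beta\,SWx + \delta J\,SWt_1 + \delta K\,SWt_2 = SW\left(\frac{Q-q}{Z}\right),\\
\delta\alpha\,SWx + \delta\beta\,SWx^2 + \delta J\,SWxt_1 + \delta K\,SWxt_2 = SWx\left(\frac{Q-q}{Z}\right),\\
\delta\alpha\,SWt_1 + \delta\beta\,SWxt_1 + \delta J\,SWt_1^2 + \delta K\,SWt_1t_2 = SWt_1\left(\frac{Q-q}{Z}\right),\\
\delta\alpha\,SWt_2 + \delta\beta\,SWxt_2 + \delta J\,SWt_1t_2 + \delta K\,SWt_2^2 = SWt_2\left(\frac{Q-q}{Z}\right).
\end{gathered}\right\} \tag{21}$$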


† If the approximations to $J$ and $K$ differ much from the maximum likelihood estimates, some 
values of $p$ may be negative or greater than unity. Since $p$, $q$ are introduced only as aids to the solution 
of the maximum likelihood equations, these values are nevertheless to be used in all that follows. 










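The working equivalent deviate (analogous to the working probit), $y$, must now be introduced. 
It is defined as 

$$y = Y + \frac{Q - q}{Z}, \tag{22}$$

where $Y$, $Q$, $Z$, $q$ are determined from the approximations to the estimates of the parameters 
by means of equations (8), (7), (13) and (16). The introduction of the symbol $\bar\theta$ for the weighted 
mean of a variate $\theta$, 

$$\bar\theta = SW\theta / SW, \tag{23}$$

and $S_{\theta\phi}$ for the weighted sum of products of deviations of any two variates $\theta$, $\phi$, 

$$S_{\theta\phi} = SW(\theta - \bar\theta)(\phi - \bar\phi) = SW\theta\phi - (SW\theta)(SW\phi)/SW, \tag{24}$$

enables the equations, after some rearrangement, to be reduced to 

$$\alpha_1 = \bar y - \beta_1 \bar x - \delta J\,\bar t_1 - \delta K\,\bar t_2, \tag{25}$$

and 

$$\left.\begin{gathered}
\beta_1 S_{xx} + \delta J\,S_{xt_1} + \delta K\,S_{xt_2} = S_{xy},\\
\beta_1 S_{xt_1} + \delta J\,S_{t_1t_1} + \delta K\,S_{t_1t_2} = S_{t_1y},\\
\beta_1 S_{xt_2} + \delta J\,S_{t_1t_2} + \delta K\,S_{t_2t_2} = S_{t_2y}.
\end{gathered}\right\} \tag{26}$$

Here $\alpha_1$, $\beta_1$ have been written for the adjusted values $(\alpha + \delta\alpha)$, $(\beta + \delta\beta)$, but in general this is 
unnecessary as there is little danger of confusion. Equations (26) show that $\beta_1$, $\delta J$, $\delta K$ are 
the weighted partial regression coefficients of $y$ on $x$, $t_1$, $t_2$ respectively. 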

Providing that no doses are included for which P is known absolutely, instead of merely 
estimated, equations (26) are complete as shown. Often, however, ‘control’ batches of 
subjects are tested in the absence of the stimulus. If the dose scale is a simple measure of 
weight or concentration, this will correspond to x = 0 in a tolerance distribution whose tail 
does not extend below x = 0; if a logarithmic dose metameter is being used, the control 
batches have $x \to -\infty$. For these controls, $Q = 1$ is known, and therefore they make no 
contribution to information on $\alpha$ and $\beta$; it does not follow that $q$, derived from equation (16), 
is unity, and the controls do give information on $J$ and $K$. Examination of the method of 
derivation of equations (26) shows that $(-F''/F)$, evaluated for $Q = 1$, must then be added 
to $S_{t_1t_1}$, $S_{t_1t_2}$ and $S_{t_2t_2}$, and that $K(1-q)F''/F$ must be added to $S_{t_1y}$ and $S_{t_2y}$. The other coefficients 
of the equations remain unchanged. Less commonly, a maximal dose ($x \to +\infty$) is 
given, for which the conditions of the problem require $Q = 0$ exactly. From equation (9), 
it is clear that the subjects so tested can give information only on $J$, which is therefore in 
some way a measure of that part of the population of subjects immune to the stimulus. 
Only two coefficients of equations (26) are affected: $(-F''/F)$ must be added to $S_{t_1t_1}$, and 
$K(1-q)F''/F$ must be added to $S_{t_1y}$. 

After the solution of the equations, a second cycle of calculation may be performed, using 
$\alpha_1$, $\beta_1$, $J_1$ $(= J + \delta J)$, $K_1$ $(= K + \delta K)$ in place of $\alpha$, $\beta$, $J$, $K$, and iteration may continue as long 
as seems desirable. Of course, in the limit $\delta J$, $\delta K$ become zero and $\beta$ becomes simply the 
weighted regression coefficient of $y$ on $x$, but for practical purposes the iteration need seldom 
be carried to the stage at which the adjustments to $J$ and $K$ are negligible. 

If $k$ doses in all have been tested (including any for which $x \to -\infty$ or $x \to +\infty$), a $\chi^2$ with 
$(k-4)$ degrees of freedom will test whether the observations agree satisfactorily with 
expectations based upon the parameters. This $\chi^2$ can be calculated by comparison of observed 










and expected frequencies. A simpler calculation, correct when the maximum likelihood 
estimates are used throughout, is 


$$\chi^2_{[k-4]} = S_{yy} - \beta S_{xy}, \tag{27}$$

and even when the iteration is not carried to the limit a satisfactory approximation will 
usually be given by 

$$\chi^2_{[k-4]} = S_{yy} - \beta S_{xy} - \delta J\,S_{t_1y} - \delta K\,S_{t_2y}. \tag{28}$$


The $\chi^2$ calculated by either of these formulae corresponds to the usual expression 

$$S\,\frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$


taken over all classes, and consequently does not allow for the undue influence of very small 
expected frequencies in some classes; this point has been discussed elsewhere (Finney, 
1947 a), and the same method of dealing with it (calculation of each expectation and grouping 
of classes) is required here. Equation (28), however, is often good enough and saves the more 
laborious calculations previously used (Finney, 1944; Wadley, 1949), except when borderline 
values indicate the need for more detailed examination. 


For data showing no heterogeneity (i.e. a non-significant $\chi^2$), the variances of the estimates 
of the parameters may be found in the usual manner. In fact 

$$V(\bar y) = 1/SW, \tag{29}$$

and the variances and covariances of $\beta$, $J$, $K$ are the elements of the inverse matrix 

$$V = \begin{pmatrix} S_{xx} & S_{xt_1} & S_{xt_2}\\ S_{xt_1} & S_{t_1t_1} & S_{t_1t_2}\\ S_{xt_2} & S_{t_1t_2} & S_{t_2t_2} \end{pmatrix}^{-1}. \tag{30}$$

$V$ will generally be calculated as a stage in the solution of equations (26). The mean $\bar y$ is 
uncorrelated with $\beta$, $J$, $K$, and the variances of $\alpha$ and of other combinations of the parameters 
can readily be derived. If there is evidence of heterogeneity, but not of a kind that 
makes the form of the tolerance distribution or some other characteristic of the analysis 
inappropriate, all variances must be multiplied by the heterogeneity factor; no new discussion 
of this point is needed (Finney, 1947a, §§ 18, 19). 

If the conditions of the problem specify $J$ or $K$, $\delta J$ or $\delta K$ will be everywhere zero, the 
second or third equation of (26) will be omitted, and $\chi^2$ will have $(k-3)$ degrees of freedom. 
If both $J$ and $K$ are specified, only the first of equations (26) remains, $\chi^2$ has $(k-2)$ degrees 
of freedom, and the calculations are those for the standard probit method, or for the corresponding 
method based on some non-normal tolerance distribution, just as described in 
a previous paper (Finney, 1947b). 


4. APPLICATIONS AND ILLUSTRATIONS 
(i) Adjustment for natural mortality 


In the testing of insecticides, ‘control’ or untreated batches of insects frequently show some 
deaths during the time that elapses between treatment and classification of the treated 
batches. It may then be assumed that some of the deaths amongst treated insects would have 
occurred even if the insects had been untreated; they are due to natural causes and not to the 
insecticide. If an insect has a probability C of natural death, independently of the insecticide, 




a dose of insecticide, sufficient to give a probability of death $P$ amongst insects not dying 
from natural causes, will have associated with it a total death-rate obtained as the particular 
case of equation (10) with $C_2 = 0$ (and $C_1$ written as $C$): 

$$P^* = 1 - Q(1 - C). \tag{31}$$

This assumes that the two causes of death operate independently of one another.† 
If $n$ insects all receive the same dose, and react to it independently of one another, the 
probability that $r$ of them die is 

$$P(r) = \binom{n}{r} P^{*r}(1 - P^*)^{n-r} = F\{1 - Q(1 - C)\}, \tag{32}$$

where, in the notation of § 3, 

$$J = 1, \qquad K = -(1 - C). \tag{33}$$

Since $J$ is specified by the conditions of the problem, the second equation of (26) is not 
required. Now 

$$\log F = \text{const.} + r \log P^* + (n - r)\log(1 - P^*).$$

Differentiation twice with respect to the argument of $F(\;)$, followed by the replacement of $r$, 
$(n - r)$ by their expectations $nP^*$, $n(1 - P^*)$ gives the expected value 

$$\frac{F''}{F} = -\frac{n}{P^*(1 - P^*)},$$

whence, using equations (19) and (31), 

$$W = \frac{nZ^2(1 - C)}{Q\{1 - Q(1 - C)\}}. \tag{34}$$

This may be regarded as equivalent to a weighting coefficient, or weight per individual, $w$, 
where 

$$w = \frac{Z^2}{Q\left\{P + \dfrac{C}{1 - C}\right\}}, \tag{35}$$

to be multiplied by $n$ in order to give the total weight for the batch. When $C = 0$, $w$ reduces 
to the familiar formula for probit analysis; the present result, however, applies to tolerance 
distributions other than the normal providing that the appropriate $Q$ and $Z$ are used. 
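
For the normal case the weighting coefficient of equation (35) can be evaluated directly rather than read from tables. The sketch below is illustrative only: the function name is hypothetical, the argument is a probit (normal equivalent deviate plus 5), and `statistics.NormalDist` supplies $Z$, $P$ and $Q$.

```python
from statistics import NormalDist


def weighting_coefficient(probit, c):
    """Equation (35) for a normal tolerance distribution.

    probit : expected probit (normal equivalent deviate + 5)
    c      : provisional natural response rate C, as a proportion
    """
    nd = NormalDist()
    deviate = probit - 5.0
    z = nd.pdf(deviate)   # ordinate Z
    p = nd.cdf(deviate)   # P
    q = 1.0 - p           # Q
    return z * z / (q * (p + c / (1.0 - c)))


# Y = 7.6 with C = 17 %: the text of Section 4 quotes w = 0.03298.
print(round(weighting_coefficient(7.6, 0.17), 5))
# With C = 0 the formula reduces to the usual probit weight Z^2 / (PQ).
print(round(weighting_coefficient(5.0, 0.0), 4))
```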


Direct estimation of $C$ rather than of $(1 - C)$ is convenient, and for that purpose the auxiliary 
variate $t_2$ of equation (20) is modified to 

$$t = Q/Z; \tag{36}$$

$t_1$, of course, is not required since $J$ is known. The factor $1/K$ is included in the auxiliary 
variate for theoretical work, but in computation it is most easily left as a multiplier of $\delta J$, 
$\delta K$, of which account may be taken later. Equations (26) now become 

$$\beta S_{xx} + \frac{\delta C}{1 - C}\, S_{xt} = S_{xy}, \qquad \beta S_{xt} + \frac{\delta C}{1 - C}\, S_{tt} = S_{ty}, \tag{37}$$

and from equation (25), 

$$\alpha = \bar y - \beta\bar x - \frac{\delta C}{1 - C}\,\bar t. \tag{38}$$


† Horsfall (1945) has strongly criticized this assumption and any attempts to base methods of 
statistical analysis upon it. Extensive experimental studies will be needed before discussion of this 
matter is taken further; in the absence of these, equation (31) is a reasonable mathematical model to 
try, and experience has proved it to be adequate for the explanation of many sets of data. 










Thus $\beta$ and $\delta C/(1 - C)$ are the partial regression coefficients of $y$ on $x$ and $t$, and when they 
have been found the revised estimate of the remaining parameter, $\alpha$, is given by equation 
(38). The $S_{xx}$, etc., are weighted sums of squares and products of deviations, the weight for 
any dose being $nw$. If some insects, say $n_0$ in all, have been included in the experiment but 
kept as untreated controls in order to provide a direct estimate of $C$, contributions from 
these will affect $S_{tt}$ and $S_{ty}$. Writing $r_0$ as the number of responses amongst these controls and 

$$c = r_0/n_0 \tag{39}$$

as the estimate of $C$ based on the controls alone, the comments below equations (26) show 
that $S_{tt}$ must be increased by 

$$\frac{n_0(1 - C)}{C}. \tag{40}$$

Since the value of $q$ for the controls is obtained from equation (31) by substitution of $P^* = c$ 
to give 

$$q = \frac{1 - c}{1 - C},$$

$S_{ty}$ must be increased by 

$$\frac{n_0(c - C)}{C}. \tag{41}$$


The calculations based on equations (37)-(41) are equivalent to those previously given 
(Finney, 1944, 1947a); the latter, however, were developed only for a normal distribution 
of tolerances, but the present theory shows the same form of analysis to be applicable to any 
distribution providing that the appropriate relationship between Y and P (equation (7)) 
is employed. The method of calculation now recommended is simpler than the earlier version, 
especially as most of it follows the standard pattern for multiple regression. 

The provisional C used in any cycle of calculations may by chance be greater than some 
of the $p^*$ for low doses; for each of these, equation (16) will then give $q$ greater than unity, 
p negative. The corresponding P, being an expected value, is of course positive, and applica- 
tion of the formula for the working equivalent deviate (equation (22)) will enable the results 
for these doses to exert their proper influence in the calculations even though diagrammatic 
representation in the usual probit or other regression diagram is impossible. 

For a normal distribution of tolerances, $\mu$ and $\sigma$ are respectively the mean and standard 
deviation. Values of the auxiliary variate, $t$, defined by equation (36), and of the weighting 
coefficient, $w$, from equation (35), have been tabulated by Finney (1947a, Table II). As an 
illustration of the calculations, the example given in Chapter 6 of that book may be analysed 
again; this is of interest as showing how the natural mortality adjustment can be employed 
in data for a biological assay, in which two insecticides or other preparations are tested 
simultaneously. 

Preparations of two derris roots, W.213 and W.214 were tested on the grain beetle, 
Oryzaephilus surinamensis. The log concentrations, in mg. dry root per litre, are shown as 
x in the first column of Table 1. The columns, n, r, p* show numbers of insects tested, numbers 


affected (dead + moribund + slightly affected), and percentages affected respectively. In 
a control batch of 129 insects, 21 were affected, so that 

$$c = 21/129 = 16.3\,\%.$$

Inspection of all the data suggested 

$$C = 17.0\,\%$$



















as a first approximation to the required estimate; values of $p$ were then calculated from 
equation (31) which may be rewritten 

$$p = \frac{p^* - 0.170}{1 - 0.170}.$$

In accordance with usual practice, these have been tabulated as percentages, instead of 
proportions, though in all formulae p is used as a proportion. 

The empirical probits of $p$ (Finney, 1947a, Table I) were next tabulated, and, either by 
inspection of Table 1 or by plotting these probits against $x$, a column of expected probits, 
$Y$, was obtained from two parallel regression lines. The weighting coefficients, $w$, and values 
of $t$ for each $Y$ were read from tables (Finney, 1947a, Table II), in order to give the next 
two columns in Table 1; for example, for $Y = 7.6$ and $C = 17\,\%$, $w = 0.03298$ and multiplication 
by 142 gives 4.7 as the total weight for the first batch of beetles. The working probits, 
$y$, were also read from tables (Finney, 1947a, Table IV). 
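
The tabulated quantities can also be computed from their definitions. As a hedged illustration (the function name is hypothetical; the paper itself reads these values from published tables), the following sketch forms the adjusted proportion $p$ from equation (31), the auxiliary variate $t$ of equation (36), and the working probit of equation (22) for one batch.

```python
from statistics import NormalDist


def batch_quantities(expected_probit, p_star, c):
    """Return (p, t, y) for one batch under a normal tolerance distribution.

    expected_probit : provisional expected probit Y
    p_star          : observed proportion affected, p*
    c               : provisional natural response rate C
    """
    nd = NormalDist()
    deviate = expected_probit - 5.0
    z = nd.pdf(deviate)
    q_expected = 1.0 - nd.cdf(deviate)             # Q at the provisional line
    p = (p_star - c) / (1.0 - c)                   # equation (31) rearranged
    q_empirical = 1.0 - p                          # q
    t = q_expected / z                             # equation (36)
    y = expected_probit + (q_expected - q_empirical) / z  # equation (22)
    return p, t, y


# Two batches using values quoted in Table 1 (Y = 7.6 with p* = 100 %,
# and Y = 6.9 with p* = 98.3 %), both with C = 17 %.
for Y, p_star in ((7.6, 1.000), (6.9, 0.983)):
    p, t, y = batch_quantities(Y, p_star, 0.17)
    print(f"Y={Y}  p={p:.3f}  t={t:.3f}  y={y:.2f}")
```

Note that `p` may fall outside the interval 0 to 1 when the provisional $C$ exceeds $p^*$; as the text explains, such values are still to be used in forming $y$.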

Calculation and summation of columns for $nwx$, $nwt$, $nwy$ gave totals from which $\bar x$, $\bar t$, $\bar y$ 
were formed. Further multiplications by $x$, $t$, $y$ led to sums of squares and products of deviations, 
found according to equation (24). All these were formed separately for the two roots. 
In order to form single estimates of $\beta$ and $C$, corresponding sums of squares and products 
were added, together with the special contributions to $S_{tt}$ and $S_{ty}$ from the controls obtained 
from formulae (39)-(41); details are shown at the bottom of Table 1. Equations (37) then 
take the form 


$$28.8295\,\beta - 76.5485\,\frac{\delta C}{1-C} = 80.709, \qquad -76.5485\,\beta + 897.1128\,\frac{\delta C}{1-C} = -214.659.$$

The inverse matrix of coefficients (equation (30)) is easily found as 

$$V = \begin{pmatrix} 0.0448475 & 0.0038267\\ 0.0038267 & 0.0014412 \end{pmatrix}. \tag{42}$$

Therefore 

$$\beta = 80.709 \times 0.0448475 - 214.659 \times 0.0038267 = 2.7982,$$

and 

$$\frac{\delta C}{1-C} = 80.709 \times 0.0038267 - 214.659 \times 0.0014412 = -0.00052.$$

Hence, by substitution of the provisional value of $C$, 

$$\delta C = -0.00052 \times 0.83 = -0.00043.$$

From equation (38), and Table 1, the remaining parameters for the regression lines are 

$$\alpha_1 = 5.5980 - 2.7982 \times 1.4417 + 0.0005 \times 1.1325 = 1.564$$

and 

$$\alpha_2 = 5.8841 - 2.7982 \times 1.3017 + 0.0005 \times 1.3233 = 2.242.$$
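
The arithmetic of this step is a two-variable weighted regression and can be checked mechanically. The sketch below is an illustration, not part of the original computing scheme: the helper `solve_symmetric_2x2` is a hypothetical name, and the numerical inputs are the totals and means quoted in the text and Table 1.

```python
def solve_symmetric_2x2(a11, a12, a22, b1, b2):
    """Invert [[a11, a12], [a12, a22]] and solve for the two unknowns."""
    det = a11 * a22 - a12 * a12
    v11, v12, v22 = a22 / det, -a12 / det, a11 / det
    return (v11, v12, v22), v11 * b1 + v12 * b2, v12 * b1 + v22 * b2


# Coefficients of equations (37) as quoted in the text.
S_xx, S_xt, S_tt = 28.8295, -76.5485, 897.1128
S_xy, S_ty = 80.709, -214.659

V, beta, ratio = solve_symmetric_2x2(S_xx, S_xt, S_tt, S_xy, S_ty)
delta_C = ratio * (1.0 - 0.17)   # ratio is delta C / (1 - C), provisional C = 0.17

# Equation (38) for each root, using the weighted means from Table 1.
alpha_1 = 5.5980 - beta * 1.4417 - ratio * 1.1325
alpha_2 = 5.8841 - beta * 1.3017 - ratio * 1.3233

print("V =", [round(v, 7) for v in V])
print("beta =", round(beta, 4), " delta C =", round(delta_C, 5))
print("alpha_1 =", round(alpha_1, 3), " alpha_2 =", round(alpha_2, 3))
```

Because $\delta C/(1-C)$ is the small difference of two nearly equal products, it is sensitive to rounding in the inverse matrix; carrying full precision through the inversion avoids this.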











Table 1. Toxicity of derris roots W.213 and W.214 to Oryzaephilus surinamensis 
(p* and p are shown as percentages in the table, but in formulae are used as proportions) 

   x      n     r     p*       p      Empirical    Y     nw       t       y       nwx        nwt        nwy
                           (C = 17.0)   probit

Root W.213
  2.17   142   142   100.0   100.0       ∞        7.6    4.7    0.343   7.94    10.199     1.6121     37.318
  2.00   127   126    99.2    99.0      7.33      7.2    9.7    0.392   7.31    19.400     3.8024     70.907
  1.68   128   115    89.8    87.7      6.16      6.3   35.0    0.565   6.15    58.800    19.7750    215.250
  1.08   126    58    46.0    34.9      4.61      4.6   47.5    1.780   4.61    51.300    84.5500    218.975
  Totals                                                 96.9                   139.699   109.7395    542.450

Root W.214
  1.79   125   125   100.0   100.0       ∞        7.2    9.5    0.392   7.59    17.005     3.7240     72.105
  1.66   117   115    98.3    98.0      7.05      6.9   14.9    0.438   7.03    24.734     6.5262    104.747
  1.49   127   114    89.8    87.7      6.16      6.4   31.4    0.539   6.12    46.786    16.9246    192.168
  1.17    51    40    78.4    74.0      5.64      5.5   22.9    0.876   5.64    26.793    20.0604    129.156
  0.57   132    37    28.0    13.3      3.89      3.8   17.6    4.557   3.89    10.032    80.2032     68.464
  Totals                                                 96.3                   125.350   127.4384    566.640

Controls  129    21    16.3

W.213: $\bar x = 1.4417$, $\bar t = 1.1325$, $\bar y = 5.5980$ 
W.214: $\bar x = 1.3017$, $\bar t = 1.3233$, $\bar y = 5.8841$ 

Sums of squares and products. For each root the three rows give the crude weighted sum, 
the correction for the mean, and their difference, as in equation (24). 

                 Snwx²       Snwxt       Snwt²       Snwxy      Snwty      Snwy²

W.213          215.1198    135.6391    163.7154    820.907    551.987    3147.90
               201.4016    158.2095    124.2803    782.040    614.326    3036.66
                13.7182    -22.5704     39.4351     38.867    -62.339     111.24

W.214          178.2746    111.9036    396.4995    779.415    602.854    3454.48
               163.1633    165.8817    168.6453    737.573    749.862    3334.17
                15.1113    -53.9781    227.8542     41.842   -147.008     120.31

Controls       $n_0(1-C)/C = 629.8235$ (added to the $t^2$ column); $n_0(c-C)/C = -5.313$ (added to the $ty$ column)

Totals          28.8295    -76.5485    897.1128     80.709   -214.659     231.55












The revised regression equations are therefore 

$$Y_1 = 1.564 + 2.798x, \qquad Y_2 = 2.242 + 2.798x,$$

and the estimate of the response rate amongst controls is 

$$C = 0.17 - 0.0004 = 16.96\,\%.$$

Evaluation of $Y_1$ and $Y_2$ for the appropriate values of $x$ gives a new set of expected probits 
which are very close to those in the $Y$ column of Table 1, and the revised estimate of $C$ is 
almost identical with the earlier approximation. Consequently, there is no need to undertake 
any further cycle of calculations. As a test of heterogeneity, derived from equation (28), 

$$\chi^2_{[6]} = S_{yy} - \beta S_{xy} - \frac{\delta C}{1-C}\, S_{ty} \tag{43}$$

$$= 231.55 - 2.7982 \times 80.709 - 0.00052 \times 214.659 = 5.60.$$

There were 10 dose levels tested, including the controls, and four parameters, $\alpha_1$, $\alpha_2$, $\beta$, $C$, 
have been estimated, leaving 6 degrees of freedom. Clearly there is no indication of heterogeneity. 
Had $\chi^2$ been larger, the possibility that large contributions came from classes with 
very small expectations would have required consideration, and grouping of classes in the 
usual manner might then have been needed. The previous analysis (Finney, 1947a, § 28), in 
which some classes were grouped together, gave 

$$\chi^2 = 1.07.$$

Since there is no heterogeneity, the matrix $V$ in equation (42) gives the variances and 
covariance of $\beta$ and $\delta C/(1 - C)$. In particular 

$$V(\beta) = 0.04485,$$

whence $\beta = 2.798 \pm 0.212$, and 

$$V(C) = 0.0014412 \times (0.83)^2,$$

whence $C = 16.96 \pm 3.15\,\%$. 
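
The heterogeneity test and the standard errors quoted above follow from a few lines of arithmetic. The sketch below simply re-evaluates equation (43) and the square roots of the variances, using the rounded figures given in the text; the variable names are illustrative.

```python
from math import sqrt

# Totals from Table 1 and the estimates obtained from equations (37).
S_yy, S_xy, S_ty = 231.55, 80.709, -214.659
beta, ratio = 2.7982, -0.00052   # ratio is delta C / (1 - C)

# Equation (43): approximate chi-square after one cycle.
chi2 = S_yy - beta * S_xy - ratio * S_ty
degrees_of_freedom = 10 - 4      # ten dose levels, four estimated parameters

# Standard errors from the elements of V in equation (42).
se_beta = sqrt(0.0448475)
se_C = sqrt(0.0014412) * 0.83    # V(C) = V(delta C / (1 - C)) * (1 - C)^2

print(f"chi2 = {chi2:.2f} on {degrees_of_freedom} d.f.")
print(f"s.e.(beta) = {se_beta:.3f},  s.e.(C) = {100 * se_C:.2f} %")
```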


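The logarithm of the relative potency of the two derris roots, or the logarithm of the ratio 
of equally effective doses, is 

$$M = \mu_2 - \mu_1 = (\alpha_1 - \alpha_2)/\beta = \frac{1}{\beta}\left\{\bar y_1 - \bar y_2 - \beta(\bar x_1 - \bar x_2) - \frac{\delta C}{1-C}(\bar t_1 - \bar t_2)\right\} \tag{44}$$

$$= -0.242.$$

Since 

$$g = t^2 V(\beta)/\beta^2 = 0.022$$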










is small (note that $t$ here is the 5 % unit normal deviate, or 1.96, and has no connexion with 
the auxiliary variate), the variance of $M$ can be found and used according to standard 
rules. The result is 

$$V(M) = \frac{1}{\beta^2}\left[\frac{1}{(Snw)_1} + \frac{1}{(Snw)_2} + (\bar t_1 - \bar t_2)^2\, V\!\left(\frac{\delta C}{1-C}\right) + 2(M + \bar x_1 - \bar x_2)(\bar t_1 - \bar t_2)\,\mathrm{Cov}\!\left(\beta, \frac{\delta C}{1-C}\right) + (M + \bar x_1 - \bar x_2)^2\, V(\beta)\right]$$

$$= \left[\frac{1}{96.9} + \frac{1}{96.3} + (0.1908)^2 \times 0.001441 + 2 \times 0.102 \times 0.1908 \times 0.003827 + (0.102)^2 \times 0.044848\right] \div 7.8288$$

$$= (0.010320 + 0.010384 + 0.000052 + 0.000149 + 0.000467) \div 7.8288 = 0.00273. \tag{45}$$

Hence $M = -0.242 \pm 0.052$. 


By taking antilogarithms of $M$ and of $M$ decreased or increased by $1.96 \times 0.052$, the statement 
may be made that W.213 has a potency 57.3 % of that of W.214, and that the fiducial 
limits of this estimate are 72.4 and 45.3 %. All these results are the same as in the previously 
published analysis, but the method of calculation given here is simpler and more straight- 
forward. Calculation of the ‘exact’ fiducial limits, as required when g is large, presents no 
special difficulties, and formulae given elsewhere (Finney, 1947a, formula (4.7)) may be 
adapted for the purpose. 
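
The relative potency calculation can be retraced as follows. This is a sketch under the assumption that the approximate (small $g$) limits are wanted; it carries unrounded intermediate values, so its results agree with the figures in the text only to within rounding. All variable names are illustrative.

```python
from math import sqrt

# Estimates and weighted means for roots 1 (W.213) and 2 (W.214).
beta, ratio = 2.7982, -0.00052            # ratio is delta C / (1 - C)
x_bar = (1.4417, 1.3017)
t_bar = (1.1325, 1.3233)
y_bar = (5.5980, 5.8841)
sum_nw = (96.9, 96.3)
v_beta, cov, v_ratio = 0.0448475, 0.0038267, 0.0014412   # equation (42)

dx, dt, dy = x_bar[0] - x_bar[1], t_bar[0] - t_bar[1], y_bar[0] - y_bar[1]

# Equation (44): log relative potency.
M = (dy - beta * dx - ratio * dt) / beta

# Equation (45): approximate variance of M.
d = M + dx
var_M = (1 / sum_nw[0] + 1 / sum_nw[1]
         + dt**2 * v_ratio + 2 * d * dt * cov + d**2 * v_beta) / beta**2
se_M = sqrt(var_M)

potency = 100 * 10**M
upper = 100 * 10 ** (M + 1.96 * se_M)
lower = 100 * 10 ** (M - 1.96 * se_M)
print(f"M = {M:.3f} +/- {se_M:.3f}")
print(f"potency = {potency:.1f} %, limits {lower:.1f} % to {upper:.1f} %")
```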
(ii) Wadley’s problem 

Wadley (1949) propounds and solves an interesting variant of the standard probit problem. 
In experiments on the control of immature stages of fruit flies, samples of fruit may be 
treated and the number of flies which survive and develop may be counted. The total number 
of flies treated, however, can be discovered only by dissection of the fruit and counting of the 
dead flies, a laborious procedure. An alternative is to take a parallel sample of untreated 
fruit and to use the number of flies developing from this as an estimate of the numbers exposed 
to treatment in the other samples. Under certain assumptions, this problem also comes under 
the general theory of §3, and the calculations based upon Wadley’s solution can be simplified. 

If before treatment insects are distributed at random in the fruit, so that the number in 
the standard size of sample used in the tests is a sample from a Poisson distribution of 
mean $N$, the probability of observing $s$ survivors in a sample subjected to a treatment which 
has a probability $P$ of killing an insect is 

$$e^{-NQ}\,\frac{(NQ)^s}{s!}. \tag{46}$$

Now equation (9) was given in terms of the probability of $r$ responses, but it could equally 
well have referred to the probability of $s$ non-responses and the same theory would have 
followed. Hence equation (46) shows the general theory to be applicable to the present 
problem, with 

$$J = 0, \qquad K = N. \tag{47}$$


It is not essential that an estimate of N from an untreated sample should be available, any 










more than in the problem of § 4 (i) an estimate of C from a control batch was essential, but 
the precision of estimation of the parameters may be much reduced if this information is 
lacking. 

From equation (46) 

$$\log F = \text{const.} - NQ + s\log NQ,$$

whence, by differentiating twice with respect to the argument of $F(\;)$ and putting $s = NQ$, 
the expected value of $F''/F$ is found as 

$$\frac{F''}{F} = -\frac{1}{NQ}.$$

Hence, from equation (19), 

$$W = NZ^2/Q. \tag{48}$$

This is equivalent to a weighting coefficient 

$$w = Z^2/Q, \tag{49}$$

multiplied by $N$, the expected number of flies tested at a dose. The weighting coefficient is 
dependent only on the form of the tolerance distribution and may easily be tabulated. The 
auxiliary variate 

$$t = -Q/Z \tag{50}$$

is introduced, and equations (26) then give 

$$\beta S_{xx} + \frac{\delta N}{N}\, S_{xt} = S_{xy}, \qquad \beta S_{xt} + \frac{\delta N}{N}\, S_{tt} = S_{ty}. \tag{51}$$

The contributions from a control, untreated, sample are very simple: $S_{tt}$ must be increased 
by $N$ and $S_{ty}$ by $(s_0 - N)$, where $s_0$ is the number of flies developing from this sample. 
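
A minimal sketch of the quantities in equations (46), (49) and (50) for a normal tolerance distribution follows. The function names are hypothetical; `Y` is here the normal equivalent deviate itself (probit minus 5), and the values of $N$ and $Q$ in the example are invented for illustration.

```python
from math import exp, factorial
from statistics import NormalDist


def survivors_probability(s, N, Q):
    """Equation (46): Poisson probability of s survivors when the mean is NQ."""
    mean = N * Q
    return exp(-mean) * mean**s / factorial(s)


def wadley_weight_and_variate(Y):
    """Equations (49) and (50): w = Z^2 / Q and t = -Q / Z at deviate Y."""
    nd = NormalDist()
    Z = nd.pdf(Y)
    Q = 1.0 - nd.cdf(Y)
    return Z * Z / Q, -Q / Z


# At Y = 0 (a 50 % kill) the weight is 1/pi per expected insect.
w, t = wadley_weight_and_variate(0.0)
print(f"w = {w:.5f}, t = {t:.5f}")

# Hypothetical example: N = 50 insects expected per sample, Q = 0.3.
print(f"P(s = 15) = {survivors_probability(15, 50, 0.3):.4f}")
```

The total weight for a dose is then `N * w`, with the current provisional value of $N$.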

The procedure for estimation of N and the parameters of the tolerance distribution then 
follows the usual plan. First a provisional value of $N$ is guessed for the data, $s_0$ being a useful 
guide to this. The empirical proportion killed at any dose is taken as 

$$p = 1 - \frac{s}{N}, \tag{52}$$

from which empirical $y$-values (whether probits or as defined by some other tolerance distribution) 
are found. A provisional regression line then leads to values of $t$ and $y$, from which 
equations (51) are constructed. The solutions of these, with 

$$\alpha = \bar y - \beta\bar x - \frac{\delta N}{N}\,\bar t, \tag{53}$$


give revised estimates of the three parameters, and iteration may proceed for as many cycles 
as seem needed. The calculations are exactly equivalent to those given by Wadley (1949), 
but they simplify the method, especially because they invoke familiar ideas of regression and 
lead to a test of heterogeneity that does not require the calculation of every expected 
frequency. 

On account of sampling variation in the actual numbers of individuals tested at different
doses, values of s for some low doses may exceed N, and the empirical proportion surviving
according to equation (52) is then greater than 1. The data for these doses will play their
proper part in increasing the estimate of N if the calculations are performed exactly as
described here, using a q greater than 1 or a negative p in the formation of each working
equivalent deviate. Though the method of calculation recommended in this paper has a
close resemblance to multiple regression methods, it is in reality a method of solving the
complicated non-linear maximum likelihood equations; the occurrence of a q greater than
unity is an indication that the analogy is not perfect, and not a condemnation of the method.

Once again equations (49)-(53) are perfectly general and apply to any tolerance distribu- 
tion of the type of (1). Chief interest, however, will attach to the normal tolerance distribu- 
tion. Wadley has given a table of the weighting coefficient for this distribution; Table 2 is 
an extension of this comparable to the usual tables of probit weighting coefficients. The
auxiliary variate differs only in sign from that of equation (36) and may therefore be read
from Finney's Table II (1947a).


Table 2. The weighting coefficient, $Z^{2}/Q$

  Expected                      Expected
  probit Y     Z^2/Q            probit Y     Z^2/Q

    1.1      0.00000004           5.1       0.34242
    1.2      0.00000009           5.2       0.36344
    1.3      0.0000002            5.3       0.38069
    1.4      0.0000004            5.4       0.39359
    1.5      0.0000008            5.5       0.40173
    1.6      0.000002             5.6       0.40488
    1.7      0.000003             5.7       0.40296
    1.8      0.000006             5.8       0.39612
    1.9      0.00001              5.9       0.38466
    2.0      0.00002              6.0       0.36904
    2.1      0.00004              6.1       0.34983
    2.2      0.00006              6.2       0.32770
    2.3      0.00011              6.3       0.30338
    2.4      0.00019              6.4       0.27760
    2.5      0.00031              6.5       0.25109
    2.6      0.00051              6.6       0.22452
    2.7      0.00081              6.7       0.19848
    2.8      0.00128              6.8       0.17348
    2.9      0.00197              6.9       0.14993
    3.0      0.00298              7.0       0.12813
    3.1      0.00443              7.1       0.10829
    3.2      0.00647              7.2       0.09051
    3.3      0.00926              7.3       0.07482
    3.4      0.01302              7.4       0.06118
    3.5      0.01798              7.5       0.04948
    3.6      0.02439              7.6       0.03958
    3.7      0.03251              7.7       0.03132
    3.8      0.04261              7.8       0.02452
    3.9      0.05491              7.9       0.01899
    4.0      0.06959              8.0       0.01455
    4.1      0.08677              8.1       0.01103
    4.2      0.10648              8.2       0.00827
    4.3      0.12863              8.3       0.00614
    4.4      0.15300              8.4       0.00451
    4.5      0.17926              8.5       0.00327
    4.6      0.20692              8.6       0.00235
    4.7      0.23540              8.7       0.00167
    4.8      0.26398              8.8       0.00118
    4.9      0.29189              8.9       0.00082
    5.0      0.31831              9.0       0.00057
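For the normal tolerance distribution, the weighting coefficient of equation (49) and the auxiliary variate of equation (50) can be computed directly instead of being read from tables. The following is a minimal sketch, assuming the usual probit convention that the expected probit Y is the normal equivalent deviate plus 5; the function names are illustrative and do not come from the paper.

```python
import math

def normal_deviate(Y):
    """Normal equivalent deviate for an expected probit Y (assumed: probit = deviate + 5)."""
    return Y - 5.0

def ordinate_Z(Y):
    """Z: ordinate of the standard normal density at the deviate."""
    d = normal_deviate(Y)
    return math.exp(-0.5 * d * d) / math.sqrt(2.0 * math.pi)

def survival_Q(Y):
    """Q = 1 - P: upper tail area of the standard normal beyond the deviate."""
    return 0.5 * math.erfc(normal_deviate(Y) / math.sqrt(2.0))

def weighting_coefficient(Y):
    """w = Z^2 / Q, equation (49)."""
    return ordinate_Z(Y) ** 2 / survival_Q(Y)

def auxiliary_variate(Y):
    """t = -Q / Z, equation (50)."""
    return -survival_Q(Y) / ordinate_Z(Y)

if __name__ == "__main__":
    for Y in (4.0, 5.0, 6.0, 7.0):
        print(f"Y = {Y:.1f}  w = {weighting_coefficient(Y):.5f}  t = {auxiliary_variate(Y):.3f}")
```

Values produced in this way may be compared with the entries of Table 2 at the tabulated probits.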

Wadley gives a numerical example of his problem, and the same data may be used to 
illustrate the method of calculation now proposed. The ‘dose’ is here measured by length of 
exposure to treatment. Wadley investigated the probit regression on number of days of 
exposure; this seems to lead to some inconsistency, for the regression equation so derived 
shows a considerable (about 20 %) death-rate for day zero. The regression may be effectively
linear between 1 and 6 days, yet depart from linearity between 0 and 1, but that would 
correspond to a peculiar form of time-tolerance distribution. As an alternative, use of the 
logarithm of the number of days as a dose metameter seems worth trying; this, in fact, appears 
to be in at least as good agreement with the data as Wadley’s supposition, and has the merit 
of internal consistency. 


Table 3. Effect of duration of treatment on development of fruit flies

     x        s     p (N = 1034)   Empirical     Y      Nw       t        y        Nwx          Nwt          Nwy
                                    probit

   0.778       4       0.996         7.65       7.19     95    -0.393    7.47     73.910      -37.335      709.65
   0.699      32       0.969         6.87       6.90    155    -0.438    6.87    108.345      -67.890     1064.85
   0.602      55       0.947         6.62       6.54    249    -0.507    6.61    149.898     -126.243     1645.89
   0.477     158       0.847         6.02       6.07    368    -0.632    6.02    175.536     -232.576     2215.36
   0.301     396       0.617         5.30       5.42    409    -0.923    5.29    123.109     -377.507     2163.61
   0.000     715       0.309         4.50       4.29    131    -2.455    4.52      0.000     -321.605      592.12
 Controls   1070
 Total                                                 1407                      630.798    -1163.156     8391.48

$\bar{x} = 0.4483$, $\bar{t} = -0.8267$, $\bar{y} = 5.9641$.

    SNwx^2       SNwxt       SNwt^2       SNwxy        SNwty       SNwy^2

   344.2602    -377.068     1393.38      3995.24     -6430.54     50954.28
   282.8046    -521.476      961.57      3762.14     -6937.17     50047.57

    61.4556     144.408      431.81       233.10       506.63       906.71

                            1034.00                     36.00
                            1465.81                    542.63


Table 3 shows the second cycle of calculations for these data. The first cycle had given

$$N = 1034, \qquad Y = 4.295 + 3.726x$$

as approximations to the maximum likelihood estimates, x being the logarithm of the number
of days. Consequently, for the second cycle equation (52) gave

$$p = 1 - \frac{s}{1034}$$


as the empirical death-rate. The expected probit, Y, was calculated from the regression
equation obtained in the first cycle; in view of the large weights of the observations, two
places of decimals were taken. Linear interpolation in Table 2 gave values of w for each Y,
and these were multiplied by 1034 to give the total weight for each observation. Since N is
the same for each line of the table, this multiplication could have been deferred until the end
of the calculations; it is sometimes convenient, however, to have records of the weights of
each observation, and this arrangement can take account of differences in the sizes of sample
used at different doses by making the multiplier of w the appropriate fraction of N. Working
probits, y, were formed from the table of Finney & Stevens (1948), in order to avoid
interpolation for Y, but linear interpolation in other tables would be sufficiently exact.
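The columns p, Nw and t of Table 3 can be reproduced from the survivor counts and the expected probits. The sketch below is illustrative only: it takes the expected probits Y as printed in Table 3, assumes the usual convention that a probit is a normal deviate plus 5, and evaluates w = Z²/Q exactly rather than by linear interpolation in Table 2, so small differences in the last digit are possible.

```python
import math

N = 1034                                        # provisional N from the first cycle
s = [4, 32, 55, 158, 396, 715]                  # survivors at each dose (Table 3)
Y = [7.19, 6.90, 6.54, 6.07, 5.42, 4.29]        # expected probits as printed in Table 3

def Z(probit):
    """Standard normal ordinate at the deviate (probit - 5)."""
    d = probit - 5.0
    return math.exp(-0.5 * d * d) / math.sqrt(2.0 * math.pi)

def Q(probit):
    """Expected proportion surviving, 1 - P, for the normal tolerance distribution."""
    return 0.5 * math.erfc((probit - 5.0) / math.sqrt(2.0))

p = [1.0 - si / N for si in s]                  # equation (52)
Nw = [N * Z(Yi) ** 2 / Q(Yi) for Yi in Y]       # total weight, N times equation (49)
t = [-Q(Yi) / Z(Yi) for Yi in Y]                # auxiliary variate, equation (50)

if __name__ == "__main__":
    for row in zip(s, p, Y, Nw, t):
        print("s = %4d  p = %.3f  Y = %.2f  Nw = %6.1f  t = %.3f" % row)
```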

The remainder of Table 3 follows the familiar pattern. To the sum of squares of deviations
of t was added N, and to the sum of products of deviations of t and y was added $(s_0 - N)$, or
36.00. Equations (51), for $\beta$ and $\delta N$, then became

$$61.4556\,\beta + 144.408\,\frac{\delta N}{N} = 233.10, \qquad 144.408\,\beta + 1465.81\,\frac{\delta N}{N} = 542.63.$$

The inverse matrix of the coefficients of $\beta$ and $\delta N/N$ is

$$V = \begin{pmatrix} 0.0211735 & -0.0020860 \\ -0.0020860 & 0.0008877 \end{pmatrix},$$

whence were derived

$$\beta = 3.8036, \qquad \frac{\delta N}{N} = -0.00455.$$

By substitution of the provisional value of N,

$$\delta N = -4.7,$$

and the revised estimate of N is therefore 1029.3. Again, from equation (53),

$$\alpha = 5.9641 - 3.8036 \times 0.4483 - (-0.00455)(-0.8267) = 4.255,$$

so that the revised regression equation is

$$Y = 4.255 + 3.804x.$$
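The solution of the pair of equations (51) for this cycle can be checked numerically. The sketch below uses only the sums of squares and products, means and provisional N quoted in the text; the variable names are illustrative, and the second unknown is written rel_dN for the ratio of the adjustment in N to the provisional N.

```python
# Adjusted sums of squares and products of deviations from Table 3
# (control contributions already added to S_tt and S_ty).
S_xx, S_xt, S_tt = 61.4556, 144.408, 1465.81
S_xy, S_ty = 233.10, 542.63

N0 = 1034.0                                   # provisional N
x_bar, t_bar, y_bar = 0.4483, -0.8267, 5.9641

# Inverse of the 2 x 2 coefficient matrix of equations (51).
det = S_xx * S_tt - S_xt ** 2
V = [[S_tt / det, -S_xt / det],
     [-S_xt / det, S_xx / det]]

beta = V[0][0] * S_xy + V[0][1] * S_ty        # slope
rel_dN = V[1][0] * S_xy + V[1][1] * S_ty      # adjustment to N, relative to N0
N1 = N0 * (1.0 + rel_dN)                      # revised estimate of N
alpha = y_bar - beta * x_bar - rel_dN * t_bar # intercept, equation (53)

if __name__ == "__main__":
    print(f"beta = {beta:.4f}, dN/N = {rel_dN:.5f}, N = {N1:.1f}, alpha = {alpha:.3f}")
```

Because the relative adjustment to N is a small difference of two large products, its last digit is sensitive to rounding in the tabulated sums.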


The iterative process could be continued from these results, but the new column of expected
probits is so nearly the same as that in Table 3 that further calculation is not worth while.
The heterogeneity test (equation (28)) here takes the form

$$\chi^{2}_{[4]} = 906.71 - \beta S_{xy} - \frac{\delta N}{N}\,S_{ty} = 22.55, \tag{54}$$


since 4 degrees of freedom remain after the estimation of three parameters. This $\chi^{2}$ is clearly
highly significant, but Wadley found an equally great indication of heterogeneity in his
analysis using 'days' instead of 'log days' for x; evaluation of the expected frequencies, NQ,
corresponding to each s shows that the large $\chi^{2}$ is not attributable to a large contribution
from one class with small expectation and also that the deviations of s from NQ do not follow
any systematic pattern. Genuine heterogeneity of the experimental material, as judged by
the standard of the Poisson and binomial distributions used in the derivation of the statistical
technique, seems the most likely explanation. Non-normality of the distribution of log
tolerances would show itself as a curvature of the probit regression line, but departure from
a Poisson distribution in the numbers of flies per fruit (a situation very likely to obtain in
reality) might reduce the true weights of observations and so permit erratic deviations
from the regression line which would appear to be significant when the Poisson weights were
used. Wadley has already suggested such heterogeneity. The theoretical consequences need
further investigation, but the difficulty is likely to be overcome in large part by the use of
a heterogeneity factor (Finney, 1947a, § 18). This is

$$\chi^{2}/4 = 5.64,$$

and is assigned 4 degrees of freedom.
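The heterogeneity test of equation (54) and the resulting heterogeneity factor can be verified with a few lines of arithmetic. This is a sketch using the rounded values quoted in the text, so agreement with the printed figures is expected only to about two decimal places; the variable names are illustrative.

```python
# Quantities quoted in the text for the second cycle.
S_yy, S_xy, S_ty = 906.71, 233.10, 542.63
beta = 3.8036            # slope
rel_dN = -0.00455        # relative adjustment to N

# Equation (54): residual sum of squares after fitting beta and the adjustment to N.
chi2 = S_yy - beta * S_xy - rel_dN * S_ty

degrees_of_freedom = 4   # seven observations (six doses and the control) less three parameters
heterogeneity_factor = chi2 / degrees_of_freedom

if __name__ == "__main__":
    print(f"chi-squared = {chi2:.2f} on {degrees_of_freedom} d.f.")
    print(f"heterogeneity factor = {heterogeneity_factor:.2f}")
```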
From the variance matrix, V,

$$V(N) = 5.64 \times 0.0008877 \times 1034^{2} = 5353,$$

and therefore

$$N = 1029.3 \pm 73.2,$$

the standard error being based on only 4 degrees of freedom. The estimation of the mean
number of flies exposed in each test is not very reliable and does not differ significantly from
the figure of 1070 based on the controls alone. An estimate of LD 50, the 'dose', or in this
example the time, expected to kill 50 % of the flies can be derived from m, the value of x
which makes Y = 5. This is

$$m = (5 - \alpha)/\beta = 0.196. \tag{55}$$


After expressing m in terms of $\bar{y}$, $\beta$, $\delta N$, by use of equation (53), its variance can be shown to be

$$V(m) = \frac{1}{\beta^{2}}\left[V(\bar{y}) + \bar{t}^{\,2}\,V\!\left(\frac{\delta N}{N}\right) - 2(m - \bar{x})\,\bar{t}\;\mathrm{Cov}\!\left(\beta, \frac{\delta N}{N}\right) + (m - \bar{x})^{2}\,V(\beta)\right], \tag{56}$$

when there is no heterogeneity, the variance and covariances being taken from the matrix V.
Here, the expression on the right-hand side of equation (56) must be multiplied by the
heterogeneity factor, giving

$$V(m) = 5.64 \times [0.0007107 + (0.827)^{2} \times 0.000888 + 0.504 \times 0.827 \times 0.002086 + (0.252)^{2} \times 0.021174] \div (3.8036)^{2} = 0.00138.$$

Hence the standard error of m is ± 0.037. Since the heterogeneity factor is based on only
4 degrees of freedom, this standard error must be multiplied by 2.78 in order to give the
width of the 5 % fiducial interval. The fiducial limits to m are therefore 0.299 and 0.093.
These values correspond to an estimated LD 50 of 1.57 days, with limits at 1.99 and 1.24 days.
The legitimacy of using the standard error may be judged from the criterion

$$g = 5.64 \times (2.78)^{2} \times 0.02117/(3.804)^{2} = 0.064.$$

When g is less than 0.1, for most practical purposes the approximate method for obtaining
the limits is sufficiently good; application of Fieller's method for calculation of the true
fiducial limits (Finney, 1947a, formula (4.7)) gives these as 1.94 and 1.17 days.
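The calculation of m, its variance from equation (56), the approximate fiducial limits and the criterion g can be retraced as follows. This is a sketch built from the rounded figures quoted in the text (including 0.0007107 for the variance of the weighted mean of y and 2.78 as the multiplier for 4 degrees of freedom); the variable names are illustrative, and Fieller's exact limits are not attempted.

```python
import math

alpha, beta = 4.255, 3.8036            # revised regression Y = alpha + beta * x
x_bar, t_bar = 0.4483, -0.8267         # weighted means from Table 3
V_ybar = 0.0007107                     # variance of the weighted mean of y, as quoted
V_beta, V_u, C_beta_u = 0.0211735, 0.0008877, -0.0020860   # elements of the matrix V
het = 5.64                             # heterogeneity factor
t_mult = 2.78                          # multiplier for 4 degrees of freedom, as quoted

m = (5.0 - alpha) / beta               # equation (55): log10 of the LD 50 in days

# Equation (56), multiplied by the heterogeneity factor.
bracket = (V_ybar
           + t_bar ** 2 * V_u
           - 2.0 * (m - x_bar) * t_bar * C_beta_u
           + (m - x_bar) ** 2 * V_beta)
var_m = het * bracket / beta ** 2
se_m = math.sqrt(var_m)

lower, upper = m - t_mult * se_m, m + t_mult * se_m
ld50_days = 10.0 ** m
limits_days = (10.0 ** lower, 10.0 ** upper)

g = het * t_mult ** 2 * V_beta / beta ** 2   # criterion for using the approximate limits

if __name__ == "__main__":
    print(f"m = {m:.3f}, V(m) = {var_m:.5f}, s.e. = {se_m:.3f}")
    print(f"LD 50 = {ld50_days:.2f} days, limits {limits_days[0]:.2f} to {limits_days[1]:.2f} days")
    print(f"g = {g:.3f}")
```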


(iii) Related problems 
The adjustment of standard probit analyses in order to take account of natural mortality,
and the maximum likelihood analysis of Wadley's problem are instances in which the
theorem of § 3 is exactly applicable. Other problems can be treated by very closely related
methods. The fitting of the Parker-Rhodes equation (Finney, 1947a, § 45), for example,
requires that equation (8) be replaced by

$$Y = \alpha + \beta x^{i}, \tag{57}$$

where x is now the absolute dose, not its logarithm, and i is a third parameter requiring
estimation. The maximum likelihood equations, which the writer has derived elsewhere,
can be rearranged so as to make the calculations formally equivalent to those for a multiple
regression on $x^{i}$ and an auxiliary variate

$$t = x^{i}\log x. \tag{58}$$

Quantitative responses whose relationship to dose may be taken as proportional to an
integral such as that in equation (3) may also be analysed by a similar procedure; the method
follows the lines suggested elsewhere (Finney, 1947a, § 47), but again simplifies the scheme
of calculation by the introduction of an auxiliary variate

$$t = P/Z, \tag{59}$$

with the aid of which a close analogy with multiple regression is brought about. Previous
accounts of both these problems discussed them only for the case in which (1) is a normal
distribution, but from the present paper it is clear that the methods can be readily adapted
for use with other distributions dependent only upon a parameter of location and a parameter
of scale.


5. SUMMARY 


Existing methods for the analysis of data relating the dose level of some stimulus to the 
proportion of individuals showing a characteristic response are in effect methods for esti- 
mating the parameters of an underlying tolerance distribution. Usually the distribution is 
assumed to be normal in respect of some known dose metameter, and the probit method may 
then be used. Analogous computational procedures can be used for any other tolerance 
distribution which is completely specified by a parameter of location and a parameter of 
scale. In the present paper a general method is developed, applicable to any such distribution, 
which leads to a convenient computational routine when experimental conditions prevent 
direct observations of proportions responding. Instances of practical importance are the
adjustment of data required when a 'natural' response rate is superimposed on that due to
the stimulus, and the Wadley problem in which the numbers of individuals exposed to the 
stimulus must be estimated from a parallel sample instead of counted in the sample tested. 
When certain simple supplementary tables have been prepared, these and related problems 
can be dealt with by iterative calculations of the type now familiar in connexion with the 
standard probit method. The process becomes formally identical with that used for multiple 
linear regression, except that both the dependent and some independent variates are modified 


at the end of each cycle of iteration, instead of the unusual patterns of computation previously 
recommended. 


REFERENCES 


Finney, D. J. (1944). The application of the probit method to toxicity test data adjusted for mor- 
tality in the controls. Ann. Appl. Biol. 31, 68-74. 


Finney, D. J. (1947a). Probit Analysis: A Statistical Treatment of the Sigmoid Dose Response Curve. 
Cambridge University Press. 

Finney, D. J. (1947b). The principles of biological assay. J. Roy. Statist. Soc. Suppl. 9, 46-91.

Finney, D. J. & Stevens, W. L. (1948). A table for the calculation of working probits and weights in
probit analysis. Biometrika, 35, 191-201.

Horsfall, J. G. (1945). Fungicides and their Action. Waltham, Mass., U.S.A.: Chronica Botanica Co.

Wadley, F. M. (1949). Dosage-mortality correlation with number treated estimated from a parallel
sample. Ann. Appl. Biol. 36, 196-202.



















AN OVERLAP PROBLEM ARISING IN PARTICLE COUNTING 


By P. ARMITAGE, B.A. 


Medical Research Council Statistical Research Unit, 
London School of Hygiene and Tropical Medicine 


INTRODUCTION 


1. The problem with which this paper is concerned has already been briefly discussed by 
Irwin, Armitage & Davies (1949). In counts of dust particles on a sampling plate, the total 
number of particles present may be underestimated on account of the overlapping of some 
of the particles. ‘Clumps’ of particles formed in this way cannot be analysed under the 
microscope, and will be counted as a single particle instead of two, three, or more. A similar 
situation arises in bacterial counting, where the number of organisms present on a plate is 
obtained by counting the number of colonies which develop from them in a certain length of 
time. Two or more colonies which overlap may be counted as a single colony, in which case 
the number of colonies present will be underestimated. 

The purpose of the present investigation is to obtain a method of correcting an observed 
count of this sort, so as to allow for the possibility of undetected overlaps. A considerable 
gap between theory and practice is probably inevitable, at least in the application to the
counting of dust particles, because of the variation in the size and shape of the particles. 
It is precisely this variation which makes it impossible for an experimenter to distinguish 
a clump from a single particle. 

The simplest mathematical model is that suggested by Irwin et al. The particles are regarded
as circular laminae of equal diameter $\delta$, and we consider the formation of clumps when N
particles fall at random on an area A. The concentration of the particles on the plate may be
measured by the quantity

$$\psi = \pi N\delta^{2}/4A$$

($\psi$ is the ratio of the sum of the areas of the particles to the area of the plate). We shall assume
in the discussion of this model that for a given concentration $\psi$, N is sufficiently large, and
$\delta^{2}/A$ sufficiently small, to make it justifiable, on the one hand, to neglect certain 'edge
effects' due to the presence of particles near the boundary of the plate, and on the other hand
to deal merely with the expected numbers of clumps of various sizes, neglecting completely
the sampling variation of these numbers.

It may be possible to avoid any bias due to edge effects by adopting the convention that 
clumps overlapping one half of the boundary are included in the count, while those over- 
lapping the other half are not included. In any case, for a given concentration $\psi$, the edge
effects assume diminishing importance as N increases and $\delta^{2}/A$ decreases.

In the note previously referred to, Irwin et al. have given an approximate formula for the
mean clump size. Defining m = N/C, where C is the number of clumps on the plate, their
formula is (in the present notation)

$$m = \frac{4\psi}{1 - e^{-4\psi}} \tag{1}$$

$$= 1 + 2\psi + O(\psi^{2}). \tag{2}$$

The argument by which (1) is obtained involves the assumption that, in each clump, every
particle overlaps every other one. This is not necessarily true for clumps of more than two
particles, and we shall show that (1) underestimates the true value of m, and is actually valid
only as far as the first two terms of the expansion given by (2). In §§ 2-4 of the present paper,
an expression for m is obtained which is valid to order $\psi^{2}$. For small values of $\psi$, such as
should occur in a well-planned particle counting experiment, (1) gives, for most practical
purposes, a sufficiently good approximation to the true value of m for this model.

Two other models are considered in §§ 5 and 6, in which the particles are regarded (a) as 
circular laminae of different sizes, and (b) as rectangular laminae of constant proportions
and different sizes. Expressions analogous to (1) are obtained in each case, involving the 
mean and variance of the square root of the area of an individual particle. 

Finally, we suggest a formula for the estimation of m in terms of quantities observable in 


an actual count, which should be applicable for the types of particles usually encountered 
in practical work. 


CIRCULAR LAMINAE OF EQUAL SIZE 


2. As a lemma, we shall need the probability density function (p.d.f.) of the distance r
between two points placed at random inside an area A, which is large in comparison with $r^{2}$.
Garwood (1947) has given the p.d.f. of r, when A is a square, a circle, or a rectangle, of unit
area in each case. For a square, for example, the p.d.f. of r is

$$\phi(r) = 2r(\pi - 4r + r^{2}) \quad \text{for } 0 < r < 1$$

and

$$\phi(r) = 2r\{4\sin^{-1}(1/r) + 4\sqrt{(r^{2} - 1)} - r^{2} - \pi - 2\} \quad \text{for } 1 < r < \sqrt{2}.$$

For small r, we see that

$$\phi(r) = 2\pi r + O(r^{2}). \tag{3}$$

We shall clearly obtain the same limiting result (3) for a unit area A of any shape, provided
that the smallest chord of A is large in comparison with r (a provision which must be made,
for instance, in the case of the rectangle).

The limiting expression (3) may be obtained directly from simple considerations. For 
convenience we shall denote the magnitude of the area (not necessarily unity) by A. Given 
the position of one point, r is less than some value $r_0$ if the second point falls within a circle
of radius $r_0$, with centre at the first point. Neglecting edge effects, the probability of this
event is

$$P(r < r_0) = \pi r_0^{2}/A.$$

Any complications due to the boundary will affect a proportionate area of order $r_0/A^{1/2}$
(provided that $r_0$ is small in comparison with the smallest chord of A). Hence

$$P(r < r_0) = (\pi r_0^{2}/A)\{1 + O(r_0^{2}/A)^{1/2}\}, \tag{4}$$

as $r_0^{2}/A \to 0$.

On differentiating (4) with respect to $r_0$, we have

$$\phi(r) = (2\pi r/A)\{1 + O(r^{2}/A)^{1/2}\}, \tag{5}$$

which will be seen to be equivalent to (3).
3. Any two particles (which we assume to be circular laminae of diameter $\delta$) will overlap
if the distance between their centres is less than $\delta$. Let X and Y be the centres of two
overlapping particles, and denote by S and T the circles with centres X and Y respectively,
and radius $\delta$.




























A third particle will form a triplet with the given particles if it overlaps either or both 
of them. Two types of triplets may be defined: 

(a) A 'chain' triplet, in which the third particle overlaps only one of the given particles.
This will occur if the third particle centre falls in one of the two non-overlapping parts of
S or T.

(b) A 'complete' triplet, in which the third particle overlaps both the others. This will
occur if the third particle centre falls in the area common to S and T.

The area of overlap of S and T is

$$2\delta^{2}\cos^{-1}(r/2\delta) - r\sqrt{(\delta^{2} - r^{2}/4)}.$$

The probability that a third particle, falling at random on A, forms a chain triplet with the
two given ones is therefore

$$P_1(\delta, r) = (2/A)\{\pi\delta^{2} - 2\delta^{2}\cos^{-1}(r/2\delta) + r\sqrt{(\delta^{2} - r^{2}/4)}\}, \tag{6}$$

and the probability that it forms a complete triplet is

$$P_2(\delta, r) = (1/A)\{2\delta^{2}\cos^{-1}(r/2\delta) - r\sqrt{(\delta^{2} - r^{2}/4)}\}. \tag{7}$$

If N particles fall at random on A, the probability that any two particles, chosen at random
out of the N, form part of a chain triplet is (asymptotically, for sufficiently large N)

$$K_1(\delta) \sim N\int_0^{\delta}\phi(r)\,P_1(\delta, r)\,dr, \tag{8}$$

where $\phi(r)$ is given by (5); and the probability that they form part of a complete triplet is
(again for sufficiently large N)

$$K_2(\delta) \sim N\int_0^{\delta}\phi(r)\,P_2(\delta, r)\,dr. \tag{9}$$
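Any inaccuracies in (6) and (7) due to edge effects will be accounted for by the asymptotic
nature of (8) and (9). From (8), (5) and (6), for a given value of $\psi = \pi N\delta^{2}/4A$, and sufficiently
small values of $\delta^{2}/A$,

$$K_1(\delta) = (4\pi N/A^{2})\int_0^{\delta} r\{1 + O(r^{2}/A)^{1/2}\}\{\pi\delta^{2} - 2\delta^{2}\cos^{-1}(r/2\delta) + r\sqrt{(\delta^{2} - r^{2}/4)}\}\,dr.$$

Using the results

$$\int x\cos^{-1}x\,dx = \tfrac{1}{4}(2x^{2} - 1)\cos^{-1}x - \tfrac{1}{4}x\sqrt{(1 - x^{2})} + \text{constant},$$

and

$$\int x^{2}\sqrt{(1 - x^{2})}\,dx = \tfrac{1}{8}\{\sin^{-1}x - x\sqrt{(1 - x^{2})} + 2x^{3}\sqrt{(1 - x^{2})}\} + \text{constant},$$

we have

$$K_1(\delta) = (4\pi N/A^{2})\left[\tfrac{1}{2}\pi\delta^{4} - 8\delta^{4}\int_0^{1/2}x\cos^{-1}x\,dx + 8\delta^{4}\int_0^{1/2}x^{2}\sqrt{(1 - x^{2})}\,dx\right]\{1 + O(\delta^{2}/A)^{1/2}\}$$

$$= \frac{3\sqrt{3}\,\pi N\delta^{4}}{2A^{2}}\{1 + O(\delta^{2}/A)^{1/2}\}. \tag{10}$$

Similarly, from (9), (5) and (7), we find that

$$K_2(\delta) = \frac{\pi(4\pi - 3\sqrt{3})N\delta^{4}}{4A^{2}}\{1 + O(\delta^{2}/A)^{1/2}\}. \tag{11}$$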
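The numerical constants in (10) and (11) can be checked by direct quadrature of the integrals in (8) and (9), ignoring edge effects. The sketch below takes the diameter as unity, so that the integral for the chain case should equal 3√3/8 and that for the complete case π/2 - 3√3/8 (these multiply 4πN/A² and 2πN/A² respectively). The composite Simpson rule and the function names are choices made here for illustration only.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def overlap_area(r, delta=1.0):
    """Area common to two circles of radius delta whose centres are r apart (r <= delta here)."""
    return (2.0 * delta ** 2 * math.acos(r / (2.0 * delta))
            - r * math.sqrt(delta ** 2 - r ** 2 / 4.0))

delta = 1.0

# Integral of r * (non-overlapping part of one circle), the bracket in the expression for K_1.
chain_integral = simpson(lambda r: r * (math.pi * delta ** 2 - overlap_area(r, delta)), 0.0, delta)

# Integral of r * (common area), the corresponding bracket for K_2.
complete_integral = simpson(lambda r: r * overlap_area(r, delta), 0.0, delta)

if __name__ == "__main__":
    print(chain_integral, 3.0 * math.sqrt(3.0) / 8.0)
    print(complete_integral, math.pi / 2.0 - 3.0 * math.sqrt(3.0) / 8.0)
```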










Now, suppose that in a count of N particles we find, on the average, $P_1$ isolated particles,
$P_2$ doublets, $P_{31}$ chain triplets, and $P_{32}$ complete triplets. We may evaluate $P_1$ easily, for the
probability that a particle is isolated is the probability that no other particle centre falls
within the circle of radius $\delta$ with centre coinciding with the given particle centre. For a
given $\psi$, and values of $\delta^{2}/A$ sufficiently small, this probability is

$$\exp(-\pi N\delta^{2}/A) = \exp(-4\psi).$$

For large N, therefore,

$$P_1 \sim N\exp(-4\psi) = N\{1 - 4\psi + 8\psi^{2} + O(\psi^{3})\}. \tag{12}$$

There are ${}^{N}C_{2}$ pairs of particles, and the proportion of this number which overlap but
which do not form parts of triplets is

$$\left\{\int_0^{\delta}\phi(r)\,dr - K_1(\delta) - K_2(\delta)\right\} \sim \frac{\pi\delta^{2}}{A} - \frac{\pi(4\pi + 3\sqrt{3})N\delta^{4}}{4A^{2}},$$

for large N. Hence

$$P_2 \sim \frac{N^{2}}{2}\left\{\frac{\pi\delta^{2}}{A} - \frac{\pi(4\pi + 3\sqrt{3})N\delta^{4}}{4A^{2}}\right\} = N\left\{2\psi - \frac{2(4\pi + 3\sqrt{3})}{\pi}\psi^{2}\right\}. \tag{13}$$

In order to evaluate $P_{31}$ and $P_{32}$, we remark that out of the ${}^{N}C_{2}$ pairs of particles, a
proportion $K_1(\delta)$ will overlap and form parts of chain triplets, and a proportion $K_2(\delta)$ will overlap
and form parts of complete triplets. Since each chain triplet contains two overlapping pairs,
and each complete triplet three overlapping pairs, it follows that

$$P_{31} = {}^{N}C_{2}\,K_1(\delta)/2 \quad \text{and} \quad P_{32} = {}^{N}C_{2}\,K_2(\delta)/3.$$
From (10) and (11), for large N,

$$P_{31} \sim \frac{3\sqrt{3}\,\pi N^{3}\delta^{4}}{8A^{2}} = \frac{6\sqrt{3}}{\pi}N\psi^{2} \tag{14}$$

and

$$P_{32} \sim \frac{\pi(4\pi - 3\sqrt{3})N^{3}\delta^{4}}{24A^{2}} = \frac{2(4\pi - 3\sqrt{3})}{3\pi}N\psi^{2}. \tag{15}$$

From (14) and (15), the total number of triplets, $P_3'$, is given by

$$P_3' = P_{31} + P_{32} = \frac{4(2\pi + 3\sqrt{3})}{3\pi}N\psi^{2}. \tag{16}$$

We have in (12), (13) and (16) obtained asymptotic expressions for $P_1$, $P_2$ and $P_3'$, valid
for large N. (We shall henceforth use these formulae without necessarily indicating their
asymptotic nature.) No account, however, has yet been taken of clumps of higher order than
triplets. Since these large clumps each contain at least two triplets, and since a triplet will
form part of a clump of higher order only if at least one other particle centre falls sufficiently
near the triplet (which event is easily seen to have a probability of order $\psi$), it follows that
the expected numbers of clumps larger than triplets are of order $P_3'\psi$, that is, of order $N\psi^{3}$.
The number of isolated triplets is therefore, from (16),

$$P_3 = P_3' + N\,O(\psi^{3}) = N\left\{\frac{4(2\pi + 3\sqrt{3})}{3\pi}\psi^{2} + O(\psi^{3})\right\}. \tag{17}$$
It is readily verified from (12), (13) and (17) that

$$P_1 + 2P_2 + 3P_3 = N\{1 + O(\psi^{3})\}, \tag{18}$$

as we should expect. The total number of clumps, C, is given by

$$C = P_1 + P_2 + P_3 + N\,O(\psi^{3}) = N\left\{1 - 2\psi + \frac{2(4\pi - 3\sqrt{3})}{3\pi}\psi^{2} + O(\psi^{3})\right\}. \tag{19}$$


An interesting point is that, if we were concerned merely to obtain the expression (19) for C,
we need not evaluate $K_1(\delta)$. For, assuming on a priori grounds that (18) is true, we have,
to order $\psi^{2}$,

$$C = (N - 2P_2 - 3P_3) + P_2 + P_3$$

$$= N - P_2 - 2P_3$$

$$= N - {}^{N}C_{2}\left\{\int_0^{\delta}\phi(r)\,dr - K_1(\delta) - K_2(\delta)\right\} - 2\,{}^{N}C_{2}\{K_1(\delta)/2 + K_2(\delta)/3\}$$

$$= N - {}^{N}C_{2}\left\{\int_0^{\delta}\phi(r)\,dr - K_2(\delta)/3\right\}, \tag{20}$$

a formula not involving $K_1(\delta)$ explicitly. It may easily be verified that (20) does in fact give
the correct result (19).
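The statements that the series for the isolated particles, doublets and triplets satisfy (18) and combine to give the quadratic coefficient in (19) can be checked by adding the coefficients term by term. The sketch below stores each of (12), (13) and (17), divided by N, as a tuple of coefficients of 1, ψ and ψ²; the tuple layout is an illustrative choice.

```python
import math

root3 = math.sqrt(3.0)
pi = math.pi

# Coefficients of (1, psi, psi^2) in P_1/N, P_2/N and P_3/N, from (12), (13) and (17).
p1 = (1.0, -4.0, 8.0)
p2 = (0.0, 2.0, -2.0 * (4.0 * pi + 3.0 * root3) / pi)
p3 = (0.0, 0.0, 4.0 * (2.0 * pi + 3.0 * root3) / (3.0 * pi))

# Equation (18): particles accounted for, P_1 + 2 P_2 + 3 P_3.
particles = tuple(a + 2.0 * b + 3.0 * c for a, b, c in zip(p1, p2, p3))

# Equation (19): clumps, P_1 + P_2 + P_3.
clumps = tuple(a + b + c for a, b, c in zip(p1, p2, p3))

if __name__ == "__main__":
    print("P_1 + 2P_2 + 3P_3 coefficients:", particles)
    print("C/N coefficients:", clumps)
```

The quadratic coefficient of C/N obtained in this way is 2(4π - 3√3)/3π, approximately 1.564.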

4. We are now able to compare various formulae which are available as approximations
to m = N/C.

For a finite value of $\psi$, provided N is sufficiently large and $\delta^{2}/A$ sufficiently small, we have,
from (19),

$$\frac{1}{m} = 1 - 2\psi + \frac{2(4\pi - 3\sqrt{3})}{3\pi}\psi^{2} + O(\psi^{3}) \tag{21}$$

$$= 1 - 2\psi + 1.564\psi^{2} + O(\psi^{3}). \tag{21a}$$

The equation (1) which is proposed by Irwin et al. gives, upon expansion,

$$1/m = 1 - 2\psi + 8\psi^{2}/3 + O(\psi^{3}), \tag{22}$$

which agrees with (21) to the first degree in $\psi$. A comparison of the coefficients of $\psi^{2}$ in (21)
and (22) shows that, at least for sufficiently small $\psi$, (21) gives the higher value of m. This
result was to be expected, since the proof suggested by Irwin et al. takes no account of chain
overlaps.

It may seem preferable to obtain an expression for m by the following expansion derived
from (21):

$$m = \left\{1 - 2\psi + \frac{2(4\pi - 3\sqrt{3})}{3\pi}\psi^{2} + O(\psi^{3})\right\}^{-1}$$

$$= 1 + 2\psi + \frac{2(2\pi + 3\sqrt{3})}{3\pi}\psi^{2} + O(\psi^{3}) \tag{23}$$

$$= 1 + 2\psi + 2.436\psi^{2} + O(\psi^{3}). \tag{23a}$$

However, (21) may be seen to give a higher value of m than (23), and since (21) undoubtedly
underestimates the true value owing to our neglect of overlaps of higher order than triplets,
it is presumably safer to use (21) than (23).

In Table 1 are shown the results of some sampling experiments which were performed by
Mr C. N. Davies and the author as a verification of (1), before the present results were
obtained. Each experimenter placed 200 points randomly on a 100 x 100 lattice, by means
of random sampling numbers. Circles of various sizes (giving different values of $\psi$) and centred
upon these random points were then drawn, and the clumps of various sizes were counted.
The corresponding values of m = N/C are shown in the third column of Table 1. The standard
errors which are also given are estimated by

$$\text{S.E.}(m) = s/\sqrt{C},$$

where s is the estimated standard deviation (with divisor $(C - 1)^{1/2}$) of the distribution of
clump size.

The theoretical values of m obtained from (1), and from (21) (neglecting in the latter
formula terms of order $\psi^{3}$), are shown in the fourth and fifth columns of Table 1 respectively.
It will be observed that the only experimental value of m which appears to differ significantly
from either of the theoretical values is that for experiment (b) with $\psi = 0.016$. Nevertheless,
it may not in fact do so, because for low values of $\psi$ the estimated standard error will be fairly
highly correlated with the observed value of m. If, therefore, the value m = 1.005 is smaller
than the true value, its standard error will also be underestimated.


Table 1. Comparison of experimental and theoretical values of m, for different values of $\psi$

                                Experimental value of m      Theoretical values of m
  $\psi = \pi N\delta^{2}/4A$         and standard error         From (1)     From (21)

        0.016          (a)      1.026 ± 0.015                 1.032         1.032
                       (b)      1.005 ± 0.005
        0.024                   1.05  ± 0.02                  1.05          1.05
        0.035                   1.08  ± 0.04                  1.07          1.07
        0.063          (a)      1.15  ± 0.03                  1.13          1.14
                       (b)      1.14  ± 0.03
        0.098                   1.28  ± 0.05                  1.21          1.22
        0.141                   1.38  ± 0.07                  1.31          1.34
        0.192                   1.47  ± 0.08                  1.43          1.49
        0.251                   1.71  ± 0.12                  1.59          1.68




















The values of $\psi$ occurring in dust-particle counting should be considerably less than 0.2.
The last two columns of Table 1 show that the difference between (1) and (21) is, for most
practical purposes, immaterial.
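The two theoretical columns of Table 1 can be recomputed from formula (1) and from (21) with terms of order ψ³ neglected. The sketch below assumes that (1) has the form m = 4ψ/(1 - exp(-4ψ)), which is the form consistent with the expansion (22); the function names are illustrative. Since the tabulated values of ψ are themselves rounded, agreement is expected only to within about one unit in the second decimal place.

```python
import math

def m_from_formula_1(psi):
    """Mean clump size from formula (1), assumed here to be 4*psi / (1 - exp(-4*psi))."""
    return 4.0 * psi / (1.0 - math.exp(-4.0 * psi))

def m_from_formula_21(psi):
    """Mean clump size from (21), neglecting terms of order psi**3."""
    c2 = 2.0 * (4.0 * math.pi - 3.0 * math.sqrt(3.0)) / (3.0 * math.pi)   # about 1.564
    return 1.0 / (1.0 - 2.0 * psi + c2 * psi ** 2)

if __name__ == "__main__":
    for psi in (0.016, 0.024, 0.035, 0.063, 0.098, 0.141, 0.192, 0.251):
        print(f"psi = {psi:.3f}  (1): {m_from_formula_1(psi):.3f}  (21): {m_from_formula_21(psi):.3f}")
```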


CIRCULAR LAMINAE OF UNEQUAL SIZE 


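5. Suppose that the number of circular particles whose diameter lies between $\delta$ and
$\delta + d\delta$ is

$$Nf(\delta)\,d\delta \quad (0 < \delta < \infty).$$

The probability that a given particle S of diameter $\delta$ is not overlapped by any particle
of diameter between $\delta'$ and $\delta' + d\delta'$ is

$$\exp\{-(\pi N/4A)(\delta + \delta')^{2}f(\delta')\,d\delta'\}.$$

The probability that S is not overlapped by any other particle is therefore

$$\exp\{-(\pi N/4A)(\delta^{2} + 2\delta\nu + \nu_2')\},$$

where $\nu$ is the mean, and $\nu_i'$ the ith moment about zero, of the distribution of $\delta$.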





TE 




















In the notation of § 3,

$$P_1 = \int_0^{\infty}Nf(\delta)\exp\{-(\pi N/4A)(\delta^{2} + 2\delta\nu + \nu_2')\}\,d\delta$$

$$= \int_0^{\infty}Nf(\delta)\{1 - (\pi N/4A)(\delta^{2} + 2\delta\nu + \nu_2') + \dots\}\,d\delta$$

$$= N\{1 - (\pi N/2A)(\nu^{2} + \nu_2') + R\}$$

$$= N\{1 - (\pi N/2A)(\nu_2 + 2\nu^{2}) + R\},$$

where $\nu_2$ is the variance of the distribution of $\delta$, and R is of order $(N\nu^{2}/A)^{2}$, if we assume that
$\nu_i'$ is of order $\nu^{i}$ (a condition likely to be fulfilled by any distributions arising in practice).
As a first approximation to m, correct to the first order in $(N\nu^{2}/A)$, we may write

$$C = P_1 + P_2 = P_1 + (N - P_1)/2 = N\{1 - (\pi N/4A)(\nu_2 + 2\nu^{2})\}$$

and

$$m = 1 + (\pi N/4A)(\nu_2 + 2\nu^{2}). \tag{24}$$


We may note that the value of m given by (24) is midway between the two values obtained
by substituting for $\delta^{2}$ in (2), $\nu^{2}$ and $\nu_2'$ respectively.

The problem of evaluating P,, in order to obtain an expression for m comparable with (21), 
has proved difficult, but we may note that, in the same way that (1) proves a better
approximation than (2) in the case of equal circles (since the coefficient of $\psi^{2}$ in the expansion of (1)
is positive), so we may expect the analogous formula in the case of unequal circles to give
a better approximation than (24).

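By an argument similar to that of Irwin et al., we obtain the approximation

$$m = \frac{4\psi}{1 - e^{-4\psi}}, \tag{25}$$

where

$$\psi = \pi N(\nu_2 + 2\nu^{2})/8A. \tag{26}$$

It is convenient to replace $\nu$ and $\nu_2$ by the moments $\mu$ and $\mu_2$ of $\sqrt{a}$, the square root of the
area of the circular particles. Since

$$\sqrt{a} = \tfrac{1}{2}\delta\sqrt{\pi},$$

we have

$$\frac{\mu^{2}}{\nu^{2}} = \frac{\mu_2}{\nu_2} = \frac{\pi}{4},$$

and (26) reduces to

$$\psi = N(\mu_2 + 2\mu^{2})/2A. \tag{27}$$

In the case of equal circles, putting $\mu_2 = 0$ and $\mu^{2} = \pi\delta^{2}/4$, (27) reduces to

$$\psi = \pi N\delta^{2}/4A,$$

in agreement with the original definition of $\psi$, as we should expect since (1) and (25) are the
same formula.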
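The equivalence of (26) and (27), and the reduction to the original definition for equal circles, can be illustrated numerically. The sketch below computes the concentration measure from a list of particle diameters in both ways, using the population variance; the function names and the sample diameters are hypothetical.

```python
import math
import statistics

def psi_from_diameters(diameters, plate_area):
    """Equation (26): psi from the mean and variance of the particle diameters."""
    n = len(diameters)
    nu = statistics.fmean(diameters)
    nu2 = statistics.pvariance(diameters)
    return math.pi * n * (nu2 + 2.0 * nu ** 2) / (8.0 * plate_area)

def psi_from_root_areas(diameters, plate_area):
    """Equation (27): psi from the mean and variance of the square roots of the particle areas."""
    n = len(diameters)
    root_areas = [math.sqrt(math.pi * d ** 2 / 4.0) for d in diameters]
    mu = statistics.fmean(root_areas)
    mu2 = statistics.pvariance(root_areas)
    return n * (mu2 + 2.0 * mu ** 2) / (2.0 * plate_area)

if __name__ == "__main__":
    mixed = [1.0, 1.5, 2.0, 2.5, 3.0] * 40       # hypothetical diameters, 200 particles
    print(psi_from_diameters(mixed, 10000.0), psi_from_root_areas(mixed, 10000.0))
    equal = [2.0] * 200                          # equal circles
    print(psi_from_root_areas(equal, 10000.0), math.pi * 200 * 2.0 ** 2 / (4 * 10000.0))
```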








RECTANGULAR LAMINAE OF UNEQUAL SIZE


6. We shall assume that all the rectangles are of the same proportions, with sides 2l and
2kl in length, k being a constant. Suppose that the p.d.f. of l is g(l) $(0 < l < \infty)$.

A given rectangle L with parameter l will be overlapped by a rectangle L' with parameter
l', falling at an inclination $\beta$ to L, if the centre of L' falls within an area Q whose boundary is
shown as the outer thick line in Fig. 1. The inner thick line in Fig. 1 is the boundary of L,
and the dotted lines show the limiting positions of L'.


Fig. 1. Diagram showing the admissible area for the centre of a
rectangle L' overlapping a rectangle L.


The admissible area $\Omega$ is

\[ 4ll'\{(1 + k^2)\sin\beta + 2k\cos\beta\} + 4k(l^2 + l'^2). \]
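As an illustrative check, the admissible area can be averaged over the inclination numerically and compared with a closed form. This is a sketch under assumptions: the inclination is taken as uniform on $(0, \pi/2)$, which by symmetry covers all relative orientations, and the function names and sample values are hypothetical.

```python
import math


def admissible_area(l, l_prime, k, beta):
    """Area in which the centre of L' must fall to overlap L at inclination beta."""
    mixed = (1 + k ** 2) * math.sin(beta) + 2 * k * math.cos(beta)
    return 4 * l * l_prime * mixed + 4 * k * (l ** 2 + l_prime ** 2)


def mean_area_numeric(l, l_prime, k, steps=20000):
    """Midpoint-rule average of the area for beta uniform on (0, pi/2)."""
    h = (math.pi / 2) / steps
    total = sum(admissible_area(l, l_prime, k, (i + 0.5) * h) for i in range(steps))
    return total / steps


def mean_area_closed(l, l_prime, k):
    """Closed form of the same average: (8/pi)(1+k)^2 l l' + 4k(l^2 + l'^2)."""
    return (8 / math.pi) * (1 + k) ** 2 * l * l_prime + 4 * k * (l ** 2 + l_prime ** 2)


numeric = mean_area_numeric(1.0, 2.0, 0.5)
closed = mean_area_closed(1.0, 2.0, 0.5)
```

For two equal, parallel rectangles ($\beta = 0$, $l = l'$) the formula gives $16kl^2$, the area of a rectangle with sides $4l$ and $4kl$, as a direct geometric argument requires.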


Assuming that all values of $\beta$ are equally likely, the probability that L is not overlapped by any other rectangle is therefore

\[ \exp\left\{-\int_0^\infty\!\int_0^{\pi/2} (8N/\pi A)\,g(l')\,[\{(1 + k^2)\sin\beta + 2k\cos\beta\}\,ll' + k(l^2 + l'^2)]\,d\beta\,dl'\right\} \]

\[ = \exp\left\{-\int_0^\infty (4N/\pi A)\,g(l')\,[2(1 + k)^2\,ll' + \pi k(l^2 + l'^2)]\,dl'\right\} \]

\[ = \exp\{-(4N/\pi A)\,[2(1 + k)^2\,l\nu + \pi k(l^2 + \nu_2')]\}, \]

where $\nu$ is the mean, and $\nu_i'$ is the ith moment about zero of the distribution of $l$. Hence

\[ P_1 = \int_0^\infty N g(l)\exp\{-(4N/\pi A)\,[2(1 + k)^2\,l\nu + \pi k(l^2 + \nu_2')]\}\,dl \]

\[ = N\{1 - (4N/\pi A)\,[2(1 + k)^2\nu^2 + 2\pi k\nu_2'] + R\} \]

\[ = N\{1 - (8N/\pi A)\,[\{(1 + k)^2 + \pi k\}\nu^2 + \pi k\nu_2] + R\}, \]

where $\nu_2$ is the variance of the distribution of $l$, and R is of order $(N\nu^2/A)^2$, assuming that $\nu_i'$ is of order $\nu^i$.
As in §5, we obtain, as a first approximation to m,

\[ m = 1 + (4N/\pi A)\,[\{(1 + k)^2 + \pi k\}\nu^2 + \pi k\nu_2], \]













P. ARMITAGE 265 


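and as a better approximation

\[ m = \frac{4\psi}{1 - e^{-4\psi}}, \]

where $\psi = (2N/\pi A)\,[\{(1 + k)^2 + \pi k\}\nu^2 + \pi k\nu_2]$.
Replacing $\nu$ and $\nu_2$ by the moments $\mu$ and $\mu_2$ of $\sqrt{a} = 2l\sqrt{k}$, we have

\[ \psi = (N/2A)\left[\mu_2 + \left\{1 + \frac{(1 + k)^2}{\pi k}\right\}\mu^2\right]. \tag{28} \]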


We may now compare (27) and (28), since in both formulae $\psi$ is expressed in terms of the moments of the distribution of the square root of the area of the individual particles. For square particles, putting k = 1 in (28), we have

\[ \psi = N(\mu_2 + 2.273\mu^2)/2A, \tag{29} \]

a result, as we should expect, not very different from (27).

As $k \to \infty$ or $k \to 0$ in (28), $\psi$, and therefore m, increases indefinitely. This, also, was to be expected, since needle-shaped particles having areas comparable to those of circular particles must be of infinite length, and each particle may therefore be expected to cross every other one.

For k = 1/5 or k = 5, we have, from (28),

\[ \psi = N(\mu_2 + 3.292\mu^2)/2A. \tag{30} \]

Even for such long rectangles the difference between (27) and (30) is not very serious.
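The numerical coefficients in (29) and (30) can be reproduced with a few lines of Python. The sketch assumes that (28) has the form $\psi = (N/2A)[\mu_2 + \{1 + (1 + k)^2/\pi k\}\mu^2]$, which is the form that yields the printed values 2.273 and 3.292; the function names are illustrative.

```python
import math


def mu_squared_coefficient(k):
    """Coefficient of mu^2 for rectangles with sides in the ratio 1 : k."""
    return 1 + (1 + k) ** 2 / (math.pi * k)


def psi_rectangles(mu, mu2, k, n_particles, area):
    """psi for proportionate rectangles, from the moments of sqrt(area)."""
    return n_particles * (mu2 + mu_squared_coefficient(k) * mu ** 2) / (2 * area)


coefficient_square = mu_squared_coefficient(1.0)   # squares, as in (29)
coefficient_long = mu_squared_coefficient(5.0)     # k = 5, as in (30)
coefficient_long_inverse = mu_squared_coefficient(0.2)
```

The coefficient is unchanged when k is replaced by 1/k, which is why k = 1/5 and k = 5 lead to the same formula.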


PRACTICAL APPLICATIONS OF THEORY 


7. We have, in §§ 5 and 6, proposed that the mean clump size should be estimated by the approximation

\[ m = \frac{4\psi}{1 - e^{-4\psi}}, \tag{25} \]

and have obtained, in (27) and (28), expressions for $\psi$ appropriate to the cases where the particles are unequal circles, or unequal (but proportionate) rectangles.

The expression for $\psi$ appropriate to any particular practical situation will probably be intermediate between (27) and (30) (the latter formula corresponding to the case of rectangles with k = 1/5 or k = 5, which are much more needle-like than the shapes which are likely to arise in practice). The experimental worker should have some idea of the shapes which are likely to occur. In the absence of such knowledge, perhaps a suitable expression for $\psi$ is

\[ \psi = N(\mu_2 + 2.5\mu^2)/2A = NK/2A, \text{ say.} \tag{31} \]


Using (31), then, and assuming that $\mu$ and $\mu_2$ are known, and that C and A are observed from the count, we may write N = mC, and, from (25),

\[ m = \frac{2KmC/A}{1 - e^{-2KmC/A}}. \]

Hence

\[ 1 - e^{-2KmC/A} = 2KC/A, \]

or

\[ m = \frac{A}{2KC}\log_e\frac{1}{1 - 2KC/A}, \tag{32} \]

which may be used as an equation for the estimation of m.
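A minimal sketch of the estimation step follows, assuming the relation $1 - e^{-2KmC/A} = 2KC/A$ is solved explicitly for m. The function names and the numerical inputs are hypothetical, and K stands for $\mu_2 + 2.5\mu^2$ in the same units of area as A.

```python
import math


def estimate_mean_clump_size(count, area, big_k):
    """Solve 1 - exp(-2*K*m*C/A) = 2*K*C/A for the mean clump size m."""
    x = 2 * big_k * count / area
    if not 0 < x < 1:
        raise ValueError("2KC/A must lie strictly between 0 and 1")
    # log1p(-x) evaluates log(1 - x) accurately when x is small.
    return -math.log1p(-x) / x


def estimate_particle_number(count, area, big_k):
    """Estimated number of particles N = m * C."""
    return estimate_mean_clump_size(count, area, big_k) * count


# Hypothetical count: C clumps observed on a plate of area A.
m_hat = estimate_mean_clump_size(count=200, area=1.0e6, big_k=50.0)
n_hat = estimate_particle_number(count=200, area=1.0e6, big_k=50.0)
four_psi = 2 * 50.0 * m_hat * 200 / 1.0e6
```

The solution exists only while 2KC/A is below unity. Values close to that bound indicate a plate so crowded that the approximation is unlikely to be trustworthy.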










We have assumed, in obtaining (32), that $\mu$ and $\mu_2$ are known, or at least may be estimated with sufficient accuracy. We may, for instance, be able to use results obtained from a previous count on the same type of particles, in which the concentration was so low that overlapping did not occur. If we have to estimate $\mu$ and $\mu_2$ from the count itself, as the mean and variance of the square root of the area of a clump, $\psi$ will be overestimated by a factor

\[ 1 + O(\psi). \]

Since (25) is in any case valid only to $O(\psi)$, we may disregard this source of error.


This problem was suggested by Mr C. N. Davies of the Medical Research Council Group 
for Research in Industrial Physiology, and I am indebted to Mr Davies, to my colleague 
Dr J. O. Irwin, and to Dr H. O. Lancaster of Sydney for some helpful discussions on the 


mathematical treatment. I should also like to thank Mrs M. G. Young for preparing the 
diagram. 


REFERENCES 
GARWOOD, F. (1947). The variance of the overlap of geometrical figures with reference to a bombing problem. Biometrika, 34, 1.

IRWIN, J. O., ARMITAGE, P. & DAVIES, C. N. (1949). The overlapping of dust particles on a sampling plate. Nature, Lond., 163, 809.







TABLES OF AUTOREGRESSIVE SERIES 
By M. G. KENDALL 


In my Contributions to the Study of Oscillatory Time-Series (1946) I gave in full four series which had been calculated from the formula

\[ u_{t+2} + a u_{t+1} + b u_t = \epsilon_{t+2}, \tag{1} \]

with certain values of a and b and a rectangular random element $\epsilon$. These series have repeatedly been used by other workers to exemplify or to verify theoretical results; see, for example, Yule (1945), Bartlett (1946), Quenouille (1947) and Orcutt (1948). It may, therefore, be useful to publish further series subsequently prepared for some studies which are not yet ready to appear.

I preserve the numbering of the four series already published and of some additional series 


referred to, but not given, in a later paper (Kendall, 1949). The full set of series, including 
those now published, is as follows: 





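| No. of series | No. of terms | Nature of stochastic element | Generating equation |
|---|---|---|---|
| 1 | 480 | Rectangular | $u_{t+2} - 1.1u_{t+1} + 0.5u_t = \epsilon_{t+2}$ |
| 2 | 240 | Rectangular | $u_{t+2} - 1.2u_{t+1} + 0.4u_t = \epsilon_{t+2}$ |
| 3 | 240 | Rectangular | $u_{t+2} - 1.1u_{t+1} + 0.8u_t = \epsilon_{t+2}$ |
| 4 | 240 | Rectangular | $u_{t+2} + 1.0u_{t+1} + 0.5u_t = \epsilon_{t+2}$ |
| 5a | 400 | Rectangular | As series 1 |
| 5b | 400 | Rectangular | As series 1 |
| 5c | 400 | Rectangular | As series 1 |
| 5d | 400 | Rectangular | As series 1 |
| 5 (in toto) | 1600 | Rectangular | As series 1 |
| 6a | 400 | Rectangular | As series 3 |
| 6b | 400 | Rectangular | As series 3 |
| 6c | 400 | Rectangular | As series 3 |
| 6d | 400 | Rectangular | As series 3 |
| 6 (in toto) | 1600 | Rectangular | As series 3 |
| 7 | 500 | Normal | $u_{t+1} - 0.9u_t = \epsilon_{t+1}$ |
| 8 | 500 | Series 7 | As series 1 |
| 9 | 500 | Normal | $u_{t+1} - 0.7u_t = \epsilon_{t+1}$ |
| 10 | 500 | Series 9 | As series 1 |
| 11 | 500 | Normal | $u_{t+1} - 0.5u_t = \epsilon_{t+1}$ |
| 12 | 500 | Series 11 | As series 1 |
| 13 | 500 | Normal | $u_{t+1} - 0.3u_t = \epsilon_{t+1}$ |
| 14 | 500 | Series 13 | As series 1 |
| 15 | 500 | Normal | $u_{t+1} - 0.1u_t = \epsilon_{t+1}$ |
| 16 | 500 | Series 15 | As series 1 |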





The stochastic element for series 1-6 inclusive was obtained by taking two-figure numbers
from the Tables of Random Sampling Numbers by Babington Smith and myself, ignoring 
00 and reducing to zero mean by subtracting 50. The ‘normal’ element for the odd-numbered 
series 7-15 was obtained from the tables by Mahalanobis and others (1934), which are to 
three decimal places and were derived by converting to normal deviates the Tables of 
Random Sampling Numbers by L. H. C. Tippett. (When the work was done H. Wold’s (1948) 

















conversion of the Kendall-Babington Smith numbers was not available.) The values of these series were themselves used as the stochastic elements in constructing the even-numbered series 8-16, so that these latter have the same structural constants as series 1, but are based on a stochastic element which is itself autocorrelated in a Markoff chain.

If $\eta_t$ obeys the relation

\[ \eta_{t+1} + k\eta_t = \epsilon_{t+1}, \tag{2} \]

then

\[ u_{t+2} + a u_{t+1} + b u_t = \eta_{t+2} \]

is equivalent to

\[ u_{t+3} + (a + k)u_{t+2} + (b + ak)u_{t+1} + bk\,u_t = \epsilon_{t+3}, \tag{3} \]

that is to say, to a third-order autoregressive series with a random stochastic element. Evidently more complicated series may be built up, if desired, from those here given. Conversely, any linear autoregressive series can be regarded as built up from those of the first or second order by an iterative process in which the stochastic element at each stage is the value of the series generated by the previous stage.
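The equivalence can be verified numerically. In the sketch below a second-order series is driven by a first-order (Markoff) element, and the third-order combination with coefficients a + k, b + ak and bk is then shown to reproduce the underlying random element. The constants, the seed, the starting values and the function names are illustrative assumptions, and a pseudo-random generator stands in for printed random numbers.

```python
import random


def simulate_nested(a, b, k, eps, u0, u1, eta0):
    """Second-order series u driven by eta, where eta is first-order in eps."""
    n = len(eps)
    eta = [eta0] + [0.0] * (n - 1)
    for t in range(n - 1):
        eta[t + 1] = -k * eta[t] + eps[t + 1]      # eta_{t+1} + k*eta_t = eps_{t+1}
    u = [u0, u1] + [0.0] * (n - 2)
    for t in range(n - 2):
        u[t + 2] = -a * u[t + 1] - b * u[t] + eta[t + 2]
    return u


def third_order_residuals(a, b, k, u):
    """Left-hand side of the third-order relation for each admissible t."""
    return [
        u[t + 3] + (a + k) * u[t + 2] + (b + a * k) * u[t + 1] + b * k * u[t]
        for t in range(len(u) - 3)
    ]


rng = random.Random(0)
eps = [rng.uniform(-1.0, 1.0) for _ in range(200)]
a, b, k = -1.1, 0.5, -0.9
u = simulate_nested(a, b, k, eps, u0=0.0, u1=0.0, eta0=0.0)
residuals = third_order_residuals(a, b, k, u)
max_error = max(abs(r - e) for r, e in zip(residuals, eps[3:]))
```

The residuals agree with the random element to rounding error whatever the starting values, since the identity is purely algebraic.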

In addition to the series themselves I give, in Tables 19-22, the serial covariances (or 
rather, the serial first-product-moments-about-zero) and the serial correlations for all series 
except the odd-numbered series from 7 to 15, which have not been calculated. A few points 
require explanation: 

(a) 'Starting up' the series. Series 1-4, the subseries of series 5 and 6, and the odd-numbered series 7-15 were started up by taking $u_{-1}$ and $u_0$ to be random numbers of the same character as the stochastic element and beginning the series with $u_1$. The subseries of series 5 and 6 were started up separately. For the even-numbered series 8-16 $u_{-1}$ and $u_0$ were taken to be the first two terms of the corresponding odd-numbered series, two extra terms of the latter, numbered 501 and 502, having been computed so as to give 500 terms of the derived series.
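The construction and starting-up of a second-order series with a rectangular element can be sketched as follows. This is an illustration under assumptions: a pseudo-random generator replaces the printed Tables of Random Sampling Numbers, the constants a and b are illustrative, no rounding of the terms is applied, and the function names are hypothetical.

```python
import random


def rectangular_element(rng):
    """Two-figure number 01-99 (00 ignored) less 50: an integer from -49 to 49."""
    return rng.randint(1, 99) - 50


def generate_series(a, b, n_terms, rng):
    """Terms u_1..u_n of u_{t+2} + a*u_{t+1} + b*u_t = eps_{t+2}.

    u_{-1} and u_0 are random numbers of the same character as the
    stochastic element and are not returned as part of the series.
    """
    u_prev2 = rectangular_element(rng)   # u_{-1}
    u_prev1 = rectangular_element(rng)   # u_0
    series = []
    for _ in range(n_terms):
        u_next = -a * u_prev1 - b * u_prev2 + rectangular_element(rng)
        series.append(u_next)
        u_prev2, u_prev1 = u_prev1, u_next
    return series


rng = random.Random(1949)
series = generate_series(a=-1.1, b=0.5, n_terms=480, rng=rng)
elements = [rectangular_element(random.Random(i)) for i in range(100)]
```

A derived series of the even-numbered kind would use the terms of another series in place of `rectangular_element`, which is why two extra terms of the driving series are needed to obtain 500 derived terms.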

(b) Product-moments. The theoretical mean of all the series is zero (and the actual means for series of these lengths are very close to zero); and there are both practical and theoretical reasons for calculating product-moments about zero rather than the actual means. I denote the first-product-sum-about-zero $\sum u_t u_{t+k}$ by $c_k$ and give the values of $c_k$ for various ranges of k in Tables 19-22.

For series 1-4 and 7-16 the sum $c_k$ is taken simply over the values of the series, i.e. is the sum of n - k terms beginning with $u_1 u_{k+1}$ and ending with $u_{n-k} u_n$. To simplify the arithmetic for the subseries 5a-6d the sum is taken over 400 terms, the extra k being obtained by a circular definition; that is, $c_k$ is the sum of terms beginning with $u_1 u_{1+k}$, proceeding through $u_{400-k} u_{400}$ to $u_{401-k} u_1$, $u_{402-k} u_2$, etc., and ending with $u_{400} u_k$. For the full series 5 and 6, $c_k$ was taken as the sum of the four corresponding values for the constituent series.
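The two definitions of the product-sum can be written compactly. The following Python sketch is illustrative; the function names are assumptions, and list index 0 holds the first term of the series.

```python
def product_sum(u, k):
    """Sum of the n - k products u_1*u_{k+1}, ..., u_{n-k}*u_n."""
    n = len(u)
    return sum(u[t] * u[t + k] for t in range(n - k))


def circular_product_sum(u, k):
    """Sum of n products, the last k pairing the end of the series with its start."""
    n = len(u)
    return sum(u[t] * u[(t + k) % n] for t in range(n))


def combined_circular_product_sum(subseries, k):
    """Value for a full series: the sum of the values for its constituent subseries."""
    return sum(circular_product_sum(s, k) for s in subseries)


example = [1, 2, 3, 4]
plain = product_sum(example, 1)             # 1*2 + 2*3 + 3*4
circular = circular_product_sum(example, 1)  # adds the wrap-around product 4*1
```

For k = 0 the two definitions coincide, both giving the sum of squares.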


(c) Serial correlations. As an estimate of the variance, I used $c_0 = \sum_{1}^{n} u^2$. The estimate of the kth serial correlation $r_k$ is then

\[ r_k = \frac{n}{n - k}\,\frac{c_k}{c_0}, \tag{4} \]

but where, as in series 5 and 6 and the subseries thereof, the summation of $c_k$ is over n terms, the factor n/(n - k) is replaced by unity.


The estimated serials for series 1-4 published in my brochure were based on the formula

\[ r_k = \frac{\operatorname{cov}(u_t, u_{t+k})}{\{\operatorname{var} u_t \,\operatorname{var} u_{t+k}\}^{1/2}} \tag{5} \]







and hence may differ slightly from the values of Tables 19 and 20. The differences are negligible for most purposes, at least for series of these lengths, but for short series (4) is better than (5), which may introduce a substantial bias.
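The two estimates may be compared on any series with a short Python sketch. It rests on assumptions: formula (4) is taken to be $r_k = \{n/(n - k)\}\,c_k/c_0$, as the remark about the factor n/(n - k) suggests, and formula (5) is taken to be the ordinary product-moment correlation between the first n - k and the last n - k terms, each measured about its own mean. The function names are illustrative.

```python
import math


def product_sum(u, k):
    """First-product-sum-about-zero over the n - k available pairs."""
    return sum(u[t] * u[t + k] for t in range(len(u) - k))


def serial_correlation_about_zero(u, k):
    """Estimate of the form (4): product-moments about zero, factor n/(n - k)."""
    n = len(u)
    return (n / (n - k)) * product_sum(u, k) / product_sum(u, 0)


def serial_correlation_about_means(u, k):
    """Estimate of the form (5): covariance over the geometric mean of the variances."""
    x, y = u[: len(u) - k], u[k:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    var_x = sum((p - mean_x) ** 2 for p in x)
    var_y = sum((q - mean_y) ** 2 for q in y)
    return cov / math.sqrt(var_x * var_y)


alternating = [1, -1, 1, -1, 1, -1]
r_zero = serial_correlation_about_zero(alternating, 1)
r_means = serial_correlation_about_means(alternating, 1)
```

On a perfectly alternating series both estimates give -1. They diverge on short series whose two end segments have noticeably different means.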


I am very grateful to the National Institute of Social and Economic Research for allowing 
me to call on the help of Miss Joan Ayling in the heavy computing involved in the prepara- 
tion of these tables. Miss Ayling did most of the arithmetical work in constructing series 5-16 
and computing the serial coefficients, and it is a pleasure to record my indebtedness to her. 


REFERENCES 


BARTLETT, M. S. (1946). J.R. Statist. Soc. Suppl. 8, 27.

KENDALL, M. G. (1949). Proceedings of the International Statistical Institute at Washington, 1947 (in the Press).

MAHALANOBIS, P. C. and others (1934). Sankhyā, 1, 289.

ORCUTT, G. H. (1948). J.R. Statist. Soc. Series B, 10, 1.

QUENOUILLE, M. H. (1947). J.R. Statist. Soc. 110, 123.

WOLD, H. (1948). Tracts for Computers, No. XXV. Cambridge University Press.

YULE, G. U. (1945). J.R. Statist. Soc. 108, 208.










Table 1. Series 5a 




























0+ 50+ 100+ 
57 10 9 
69 26 6 
65 16 34 
80 2 64 
53 25 18 
—6 60 —31 
— 55 49 —77 
—72 3 — 65 
-71 21 —73 
— 86 —5 — 69 
— 68 —6 1 
~—51 32 76 
— 48 29 91 
— 59 53 74 
—81 58 35 
— 99 26 —11 
—31 34 — 24 
5 —19 -1 

66 — 59 —15 
51 —91 — 59 
65 — 65 -—17 
23 -9 —11 
10 60 —44 
44 22 — 90 
82 —16 —47 
95 — 27 —15 
59 2 27 
25 40 -1 
23 32 — 58 
57 14 —51 
76 26 — 29 
75 —13 — 43 
33 16 —14 
30 —2 -—7 
—20 35 28 
— 34 89 27 
— 65 126 — 28 
—21 76 -i1 
39 68 —2 
31 44 — 28 
55 21 —1 
42 —21 32 
21 — 66 61 
—5 — 36 42 
—5 25 19 
— 36 32 9 
—4 13 29 
— 32 —4 5 
—19 21 — 48 
— 24 13 — 24 





150+ 





55 
14 























200 + 250 + 300 + 350+ 
~27 ~67 ~46 ~56 
—53 ~50 ~59 -31 
67 24 ~64 —22 
~86 32 —88 5 
~83 ~14 ~95 57 
~67 ~12 ~29 109 
—43 ~23 ~17 82 
-3 ~53 ~9 57 
66 ~38 ~40 ~~ 
30 7 ~57 ~ae 
22 61 = ~29 
10 75 69 ~ 25 
~16 47 37 ~35 
6 6 46 12 
17 —59 9 60 
59 ~74 14 65 
27 —83 on 40 
~9 ~ 84 ~34 ~13 
~32 2 ~ 50 —83 
~74 -~9 ~58 —58 
-~71 22 ~80 ~19 
~64 ~20 92 —5 
~ 64 ~65 ~89 ~12 
41 14 ~78 ~48 
20 29 ~66 —32 
65 81 ~58 -16 
101 60 ~12 —48 
124 ~18 8 ~79 
62 — 50 ~33 _~97 
—26 ~ 54 ~48 ~56 
-37 ~ 65 ~37 ~52 
— 55 —53 ~31 13 
~22 ~46 0 3 
-2 ~19 14 —33 
17 ~30 2 3 
16 13 44 60 
—19 47 22 42 
4 74 ~21 50 
60 19 ~8l 18 
38 ~31 — 108 ~48 
—35 ~44 —92 —29 
~95 ~50 ~38 -~43 
~72 ~60 34 ~68 
—50 ~87 43 —55 
ae ~83 18 12 
48 | —7) ~23 52 
45 | 12 ~68 10 
| —18 70 ~97 ~ 55 
| —7 | 25 —90 — 37 
| -90 | -25 —58 —59 


| 











Thus, the 242nd term lies in the column headed '200+' and the row 42, i.e. is -95.
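The layout rule just stated, columns headed by multiples of 50 and rows numbered 1 to 50, amounts to a simple index calculation. The helper below is a hypothetical illustration of that rule, not part of the original tables.

```python
def table_position(term):
    """Return (column heading offset, row number) for the given term number."""
    if term < 1:
        raise ValueError("term numbers start at 1")
    offset = 50 * ((term - 1) // 50)   # 0, 50, 100, ... as in the column headings
    row = term - offset                # 1 to 50
    return offset, row


position_242 = table_position(242)
```

The 242nd term is thus found in the column headed '200+' at row 42, in agreement with the example in the text.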






























































Table 2. Series 5b 

No. of term  0+  50+  100+  150+  200+  250+  300+  350+
1 —34 —45 —28 111 —16 —29 —23 16 

2 | 10 if ~*~ 37 34 —30 —24 7 
3 67 —15 ~ i 30 1 —61 6 109 
4 43 —48 19 ~~" —— —36 —25 35 
5 | 25 —% 49 —%4 -ll —48 i = 
6 | 33 ~10 51 —-@ —28 — 46 ~35 —86 
7 69 --31 —7 7 ll = —25 —120 
sg | =| «i 6 26 22 —40 —44 —80 
9. | 96 29 18 il 68 —71 -2 — 37 
10 | 49 35 8 23 51 — 50 —25 -—18 
1 | 47 | 28 ~48 —26 <2 — 54 —64 15 
12 | 43 | 17 ~ 90 ~29 -% ~33 —47 25 
13 2 | -i ~63 —23 —e — 28 24 28 
-m | =a |) oan — 46 ~10 —F —58 67 50 
6 | -73 | —-%5 13 l — 67 —19 61 4 
16 ome) | aca 67 ~16 —49 51 —l —52 
17 -40 | 10 9 | -—32 —16 81 —i1 — 85 
18 -30 | 13 43 | 7 9 107 —21 —89 
19 —20 | I —~33 67 ~6 116 24 —101 
20 23 | 23 63 | 67 49 113 4 81 

| | 

21 _27 | 50 aa 4 19 —71 40 22 5 
22 —5l | 33 =——. | = 28 — 103 2 53 88 
23 7 | - 1} | <ae — 65 — 53 82 97 
24 a” SS 32 —53 12 —8l 76 19 
25 56 «| 6-96) sO 39 = 87 24 —75 62 —32 
26 72 | -92 | 9 —35 20 =i —7 ~- 
27 86 | —-48 | —-10 —15 10 —4 — 44 — 28 
28 72 (| 5s | 1 48 —20 5 —23 —24 
29 75 (| 81 | 26 88 at 28 «a ~a 
30 15 | 119 | —% 117 2 57 —?2 1 
31 —17 | 134 | -—69 113 —13 25 57 —6 
32 -27 | qe | —% 96 ~§ =i 88 15 
33 =i ss | =< 50 -_ =" 89 16 
34 46 — 54 | 56 34 —28 is 63 —23 
35 91 —89 | 104 4 —29 ~6 il —69 
36 | 125 -23 | 9 | 26 ~43 ~20 ~49 ~58 
sO 81 = a 55 6 16 —32 —29 —10 
38 16 18 59 -—6 73 —69 = 6 
39 ~6 78 25 31 105 —W 16 8 
40 24 52 —22 21 37 ~ 42 98 
41 64 i9 | m1 —o —6 28 45 —15 
42 eae = o | <9 —$7 76 18 ~ 
48 | -23 | -28 | -23 | -20 — 20 119 —13 2 
44 | 6-65 «| —C-6 -37 | -9 24 62 13 -1 
45 | — 38 45 —59 «| 32 1 1 ~9 ~«$ 
46 | —56 | 70 —79 77 —20 —6 13 17 
47 | —87 11 — 32 —61 24 42 —25 
eo | -#@ | 12 27 — 42 = ar l —79 
-— | «ae 4 40 92 —59 19 —39 —44 —85 
50 | -29 | 22 | = no —41 —4 —81 —24 —36 










Table 3. Series 5c 









































No. of term  0+  50+  100+  150+  200+  250+  300+  350+
1 —26 —55 —20 40 —% 14 27 —8 
2 —20 —43 —21 100 1 a% 42 —53 
3 x6 —37 13 45 32 ps 0 a { 
4 42 -15 45 —45 23 —35 2 1 [ 
5 32 =e = —96 17 —50 ag 19 
6 4% 1 —55 —36 53 — —18 52 
7 12 = tt —25 19 10 —27 12 
8 49 — 67 —26 —49 <— 40 —$i — 36 
9 14 ~ et —33 —25 —30 43 —20 —40 | 
10 —53 16 —29 27 zy 9 2 10 | 
| 
ll —99 67 l —5 9 14 57 —4 | 
12 — 67 55 57 — 36 7 14 95 0 | 
13 —12 —22 102 —10 —$7 20 95 41 | 
14 12 —22 65 25 —78 = 53 92 | 
15 22 — $7 30 18 —59 —53 —26 41 
16 66 —15 0 —30 17 —104 —32 $7 
17 25 — 25 11 | —60 34 — 68 —15 —98 | 
18 =a —2) l SS 72 23 42 —~§5 
19 — 52 nwt 20 13 109 65 85 22 
20 —30 9 —51 10 83 41 82 80 
21 —17 41 —72 =%% 24 21 22 105 | 
22 —3e 3 —29 —20 =o 44 —30 43 | 
23 17 —39 = —52 —24 9 —16 —23 | 
24 36 —78 30 —25 —48 —18 39 -98 | 
25 64 —53 33 = —8T —32 87 — 134 
26 28 — $3 —25 5] —73 = 114 —82 
27 14 —45 —92 42 —9Fi —19 57 —60 , 
28 —15 — 67 —65 33 —34 —24 -18 =} 
29 —44 — 96 8 4] — $8 -—18 — 67 74 
30 —30 = $98 32 39 34 2 —§3 36 
31 — 54 —79 45 25 75 28 —18 30 
32 —72 — 13 58 106 = a= 29 | 
33 — 66 1] —23 28 97 =e —61 is | 
| 34 —173 a» Hi = 37 21 -11 —85 19 | ‘ 
| 35 9% 16 17 7 3 —22 —21 —14 
36 Fi —24 21 —- 36 —22 7 —22 
37 3 6 23 =e 8 —58 41 . 7 
38 38 =n =% —76 9 = 13 16 | 
39 53 —22 —49 —$9 1 3 25 -18 | 
40 47 | 17 —$3 —~ #3 = 8 2 -70 | 
41 22 —16 —48 —¢ — 62 30 22 —63 | 
42 <9 — 69 —28 29 —15 47 57 =$ | 
43 — 62 —31 — 47 = 10 2 79 75 29 
44 —50 4} — 60 — —22 24 63 37 
45 10 93 —47 2} —68 om eS ay 
46 55 84 25 — 48 —26 0 — 93 —79 | 
47 64 64 30 — 67 —41 65 —135 -70 | 
48 92 0 12 —94 —44 97 —93 —43 | 
49 95 —8l — 34 —116 9 30 -11 1 | , 
50 29 —92 ~16 —61 13 24 1 52 | 
| 












































Table 4. Series 5d 
No. of term  0+  50+  100+  150+  200+  250+  300+  350+
| 1 8 68 —52 63 18 —13 69 —9 
2 24 117 —% 39 il —38 22 27 
on 21 81 —49 23 18 —64 —52 —5 
4 38 TF = 8 4l =o —52 —61 
5 45 —F2 60 —33 77 —24 17 — 52 
a5 4 —9 —29 49 11 51 18 2 20 
7 7 a —42 22 47 | 29 73 — 26 66 
8 —§ —55 18 9 16 67 —i 54 
9 — 25 8 51 —60 | —-21 38 =% 6 
| 10 21 68 20 —96 9 —18 —5 29 
| Wl 36 59 —12 — 36 17 — 57 —23 ll 
12 0 5 —10 30 —19 —27 6 —il 
13 —57 23 =< | 76 — 56 —1i 18 17 
i4 —52 28 — 62 3s i -#=8 44 —19 5 
15 —49 10 —7 | D> i <= $7 =a 18 
16 5 39 —90 | —852 3 121 44 —12 
17 54 60 —§3 | —1%) 29 79 54 —26 
18 74 14 2 | -97 | 37 24 72 -8 
19 56 19 45 — 88 | 0 — 16 49g —15 
2 63 —26 78 -32 | 1 —9 48 1 
2) =% 4 33 39 | 8 —40 39 —il 
22 —42 —__ 42 = | = —52 —18 35 
| 23 = 2 24 —32 12 —52 | —30 70 
| 24 —89 51 — —56 11 —o- | 20 7 
25 — 86 23 7 —28 17 —— | 86 22 
26 =o 32 19 —28 | a —e 116 -—9 
27 16 67 22 —5i1 | —I16 —% 109 —58 
28 56 34 27 a | — 0 19 —32 
29 77 —i 7 —19 == 56 ~—9 23 
30 19 —45 —16 16 — 33 64 =8 4 
31 — 43 oe —13 — —36 =} — — 24 
32 —22 —60 25 — 24 —39 — 46 4 14 
| 33 28 G3 ef 22 —42 —79 =—% 64 
| 34 86 3 65 | 20 —24 — 102 3 59 
35 117 1 4 | 29 —20 —84 —39 | -—9 
36 108 9 68 | 53 31 —29 — 64 — 36 
| 37 103 50 48 | 67 17 57 => | —-_ 
| 38 53 7 4 | 47 —39 94 CC 33 | -58 
39 -3 — 23 12 —4 —3 104 | 58 | -—62 
| 40 —29 — 66 —14 | — 68 23 115 | 89 | 5 
41 — 57 —90 — 55 —45 21 41 37 | 11 
42 —92 — 20 — 25 — ie —12 ae 20 «| —3 
| 43 — 60 3 =% — 8% —44 —67 = 10 
| 44 4 58 97 11 -—i —88 2 —25 
45 14 33 —29 34 24 —76 — 45 o> 
| 46 53 —36 — 55 41 24 —42 —73 — 46 
| 47 11 —73 ae 18 36 —a9 —96 —13 
48 33 —96 — 40 i 25 = —70 —1 | 
49 —18 —95 29 14 | 50 49 —16 —20 | 
50 12 — 50 62 6 | : 4 99 18 —70 
i 


















































































No. of term  0+  50+
1 46 =e 
2 ae 55 
3 27 119 
4 —19 78 
5 17 38 
6 52 — ee 
7 51 — 35 
~ 47 8 
9 =§ 71 
10 =i 40 
1] — 50 — 50 
12 —-25 | —112 
13 30 | = —65 
14 m=: 35 
15 24 | ay 
16 a | 62 
17 —122 | 2 
18 ee eee 
: 2 47 —F6 
| 20 44 — 36 
21 — 22 54 
| 22 0 127 
| 3 9 106 
24 6 — 29 
25 — — 14} 
26 — 48 — #17 
27 —72 —38 
28 a 55 
29 26 96 
30 87 60 
31 52 aa 
32 —24 —8 
33 — 20 —48 
34 — 24 —90 
35 —12 -100 
36 ae | a8 
37 101 | 38 
38 37 | 36 
39 —4 | 7 
40 —34 | 16 

| 

41 = 1] 
42 10 | 17 
43 -_— | ae 
44 go | —71 
| 45 86 | —40 
| 46 39 CO 12 
| 47 =49 24 
48 —79 2 
49 — 126 22 
50 — 120 22 













Table 5. Series 6a









































100 + 150+ 200 + 250+ 300 + 350+ 
hi | | 
—16 -5 147 98 | 4s | —21 
-71 54 121 42 | 1 | —-14 
— 96 80 —13 — 36 —-10 | —-22 
—90 15 —83 — 105 —36 | 0 
11 —42 —97 — 134 —26 | 51 
129 —44 —78 —42 —45 | 29 
93 23 —13 70 —56 | 42 
46 104 26 108 -28 | —64 
— 28 70 —5 62 58 | il 
—79 0 —%3 17 87 | 95 
} 
—52 —16 —93 5 84 | 48 
—21 — 30 — 12 5 | 38 —36 
55 =a 78 —15 | 17 —101 
89 | 19 61 -62 | -13 | -79 
47 | 86 —42 =t4 | 9 1] 
—37 39 —103 21 | 2 124 
—114 —5] — 86 67 | -53 | 170 
—93 | #7 —24 93 —100 | 105 
18 | —2i 30 67 —-66 | -—87 
75 | —2 29 —43 | 32 | -111 
| | 
32 | 48 —37 -55 | 40 | —56 
a 82 — 20 —38 | ll | 46 
-—3s | 8 ~13 —-40 | -27 | 128 
—s | =—80 —10 —37 | —4 129 
—309 | —33 — 26 24 | 18 | 12 
-30 | 29 21 84 ~9 71 
— 33 14 67 92 — 57 — 107 
— 32 — 36 19 30 -98 | —-16 
—4 — 85 11 —— | = | 100 | 
17 — 63 44 — 132 a 122 | 
18 40 0 —69 143 8 | 
42 13 ~71 30 142 —iis | 
32 151 — 47 76 53 —179 | 
—42 87 —6 98 — 28 —61 | 
—111 —39 23 25 — 105 37 
—113 —118 —15 ~ 55 —70 85 | 
—19 —121 —65 — 62 —§ 76 | 
115 ~10 ~—19 — 28 94 26 
142 91 68 —9 65 13 
64 100 101 —30 36 1 
— 44 10 10 —75 15 — 36 
— 148 — 58 —24 —59 | 36 —15 
— 100 —44 —49 —49 | 55 3 
26 18 5 —2] 10 23 
118 31 22 44 — 28 16 
98 —27 ~19 28 | —-39 —26 
—12 — 84 9 -40 | -16 5 
—71 — 120 11 —46 | 19 | -—8 
— 62 —22 34 —2 6 | 13 
— 58 102 70 54 -8 | 23 
| 













Table 6. Series 6b



































No. of term  0+  50+  100+  150+  200+  250+  300+  350+
l = 
4 : 34 —51 ~61 69 =" 19 - 64 
2 Sy 15 — 84 — 64 75 — 63 —35 35 3u 
0 hk ~94 ~16 ~21 —52 ~73 83 ~22 
I 2-4 -— tae 49 —~ 106 38 —88 33 —26 
9 5 | 55 | 45 105 — 65 47 — 23 —72 —10 
2 6 | 60 | 126 | 34 —20 17 18 — 149 36 
4 ae 49 a 4 =e 16 28 23 —92 59 
l 8 | 31 —-35 | -—170 37 il 30 —24 38 
5 * | -s — 165 33 22 —49 —29 81 39 CO] 
10 | -27 — 124 75 —2 —26 ~16 85 -20 | 
° 
B ll | —38 21 23 = = 6 19 —6 
| a 24 168 —s | —3 8 18 i % 41 
13 | 18 187 —50 —26 40 59 —90 49 
14 | -12 69 =e —9 54 89 —49 47 
{ Si) ae = 16 54 67 78 33 46 
) | 16 | —42 ~ 104 21 78 ~5 6 108 49 
5 | im. | — 108 7 67 —90 —40 66 50 
] | a2 | —12 ae 19 ——_ —141 —45 —2 ~-29 
| | 24 39 12 —i5 = — 126 ay —1i2 —95 
| 20 83 —§ — 66 —42 —9 44 24 —78 
) 
21 63 —8 — 62 27 42 99 42 —33 
22 24 —49 34 98 63 70 —10 —8 
) | 23 —58 =i 45 53 — 20 —89 et 
| 24 —— —29 52 —53 —86 <a — 133 59 | 
| 95 —24 2 | = —98 = —37 422 | 
‘oe El ag —14 — 66 —21 —6 —4) 38 30 | 
) | -_ —4 4 — 66 78 42 —78 58 36 CO 
= | 2% | 48 44 — 28 145 18 —37 —13 —38 | 
Ef | 29 = Fe 94 13 117 18 28 —27 —-44 | 
| 30 2 88 59 7 —ae 30 ll —14 | 
31 103 69 63 — 108 —21 52 53 9 
| Se 4 99 53 —9 =O all 44 92 42 
| 33 | -20 47 —71 52 — 26 5 38 6 
| 34 | —134 14 —88 141 —46 11 —60 —61 
| 35 | —135 27 ar 102 —12 31 -—112 —50 | 
| 36 | —65 12 64 22 34 74 — 59 ae 
aa 23 —54 16 — 68 22 86 63 2 | 
38 | 114 —44 11 36 15 81 160 34 | 
39 CO 62 33 —30 47 —28 65 77 51. | 
40 15 91 —79 49 —29 1 —89 61 
4) = €3 74 —107 127 —22 —92 — 152 18 
a | —16 22 —34 116 25 —59 —75 —48 
a i — 1221 — 82 l 64 48 —20 87 —86 
- i =< -61 12 an 12 65 153 —44 
45 80 16 12 —2a7 —61 77 75 —25 
46 94 34 — | —4237 — 125 —oe —32 anil 
47 27 63 om | 1 —92 — 126 —46 53 
48 — 25 48 —22 68 19 — 142 —$ 59 
49 —5 —39 33 119 76 nee —§ ¥ 
:a 50 —25 —52 43 | 91 38 2 15 —36 

















































Table 7. Series 6c 






































No. of term  0+  50+  100+  150+  200+  250+  300+  350+
1 —29 =a 3 —95 10 —67 49 —2) 
2 ~= §8 — 5oe — 65 — Fi} 45 42 53 — 26 
3 —9 — 20 | —29 51 37 59 —59 
4 8 91 42 51 38 56 —38 — 30 
5 —29 103 43 66 — 28 26 — 26 — 68 
6 l 45 7 42 —91 8 — 64 — 23 
7 5 —26 —53 —10 — 652 = £6 — 54 —2 
8 35 —110 — 108 —2) 50 — 27 17 = 19 
9 34 —112 947 20 114 47 51 = 
10 8 — 54 20 72 99 9} 29 1 
ll 30 —16 45 52 59 82 — 55 67 
12 69 9 9 —20 1 2 — 9M 85 
13 85 29 — 59 $8 — 65 —94 —109 81 
14 —12 32 — 97 — 36 = %% —96 and 26 
15 —89 4 3 28 — 64 16 131 1 
16 —131 =— 97 42 102 — 32 54 122 = $i 
17 = §7 0 90 109 60 24 5 7 
18 47 —4) 35 36 85 — 30 — 60 —§ 
19 106 =—7 4 — 60 64 —48 — 82 = 23 
20 47 = 19 — 108 — 40 —7 — 62 —24 
21 — 65 — 37 52 —29 — 106 12 41 6 
22 —97 — 54 20 93 — 122 19 54 62 
23 —89 ~% = 56 109 —49 45 3 93 
24 — 69 68 — 48 44 23 18 =e 98 
25 7 71 — 60 6 97 21 , 4 
26 56 27 — $7 —4 124 47 8 —116 
27 43 -- §} 55 —40 16 —3 50 — 136 
29 34 —29 24 —4¢ — 105 —8l 35 —58 
29 —33 55 —§ 35 — 157 — 126 33 62 
30 — 26 108 — $8 59 —55 — 114 —4) 137 
31 —4 84 —43 65 lll —i} — 26 55 
32 36 39 —27 -9 214 98 ~ 59 0 
33 63 —24 44 —29 173 83 51 — 26 
34 68 —87 104 16 43 46 82 5 
35 24 —43 39 15 —49 —14 29 49 
36 —42 39 — 37 64 —94 —32 7 51 
37 —76 45 ~ 87 —27 — 65 — 42 —14 46 
38 — 25 39 —27 — 104 — $8 28 = —19 
39 54 53 — 46 —77 4] 67 28 — 60 
40 128 —18 — 68 —43 32 78 73 —19 
41 124 —32 3 —'3) 25 —13 106 27 
42 —3 =—8 66 —25 — 26 — 108 89 | 87 
43 — 122 20 103 21 —45 —69 43 | 71 
44 — 166 7 44 23 — 37 31 19 | 0 
45 — 120 9 — 80 0 ll 105 -50 | —658 
46 —22 24 —101 —§ 69 125 -94 | —108 
47 120 23 —63 —20 60 10 — 36 te. 86 
48 176 21 15 25 —30 — 68 = § — 49 
49 48 38 49 19 —94 —45 | —2 
50 —98 28 —6 —16 — 100 —23 -9 | -1 

























| 


























Table 8. Series 6d 
No. of term  0+  50+  100+  150+  200+  250+  300+  350+
1 =~ ST 88 118 —178 —16 —57 7 —35 
2 — 54 88 106 25 —26 -11 42 —58 
3 —41 73 50 68 —32 in ¥ 64 —85 
4 26 22 16 30 —% 23 3 $5 
5 27 —26 26 33 12 51 —20 12 
6 1 on 39 — 60 27 73 —24 46 
7 12 = 38 44 —19 38 —8 —19 35 
8 13 —47 —2 66 32 —113 46 29 
9 ~§ —30 1 126 52 — 159 87 —27 
10 9 —20 —36 104 29 —36 79 —8i 
ll 1 —26 —35 43 15 40 —22 —20 
12 wf —12 —10 «<a 21 42 —101 17 
13 45 3 46 —98 46 —26 —90 6 
14 77 42 10 = 54 —61 —52 22 
15 53 43 —2) 5 ml —25 59 49 
16 17 48 — 42 68 —15 —20 81 19 
i7 25 32 — 52 55 —52 23 29 = 
18 18 37 =f —18 —14 67 —16 —3 
19 —26 40 41 —26 44 9 —5 s 
20 —62 8 38 —44 27 l —36 = 
21 —84 —67 19 «4 Qh 6 8 8 
22 —he —52 — 28 68 -111 1 8 48 
23 70 —20 —36 42 —92 8 6 65 
24 106 —30 —45 17 n% 26 =% 80 
25 94 — 22 — 62 7 60 nf i 81 
26 63 48 —24 =f 114 19 =—§ ~i6 
27 4 101 18 —23 85 62 —40 = 
28 —89 46 22 — 52 1 88 —72 =F 
29 102 —53 54 —52 —103 81 —46 —65 
30 =18 —76 32 = 93 — 162 8 —20 —51 
31 96 6 6 —21 —110 —61 ~~ 18 
32 96 38 —44 —44 31 —78 47 99 
33 37 21 —53 —75 118 —86 16 140 
34 —28 —21 —33 —5 lll —40 —20 92 
35 —61 6 a 27 —13 -—8 —25 2 
36 —85 37 14 —8 —153 54 — —%8 
37 me ~i3 30 —43 — $72 41 41 —63 
38 74 = 26 «$8 —45 —28 62 30 
39 96 —99 — $7 13 98 —24 78 58 
40 71 —94 — 46 88 134 —20 80 41 
41 14 —22 —58 59 108 -11 49 43 
42 —7 69 —170 —46 10 46 37 61 
43 25 87 19 —53 —60 59 47 28 
44 a" 45 72 —69 —119 57 27 —<47 
45 22 —68 57 74 —42 16 —10 -61 
46 50 —8l 47 — 67 55 14 —25 —58 
47 17 —65 39 17 109 =ié —28 32 
48 —53 — 26 —16 43 60 5 37 130 
49 — 52 0 —91 =i 25 —10 82 91 
50 6 66 —98 —32 —26 19 49 3 











# 
















Table 9. Series 7 


















































































No. of term  0+  50+  100+  150+  200+  250+  300+  350+  400+  450+
1 | —0-060| 0-666} —0-638| 1-196| 1-070] —0-267| 0-593] --0-726| —2-320| —0-871 
2 | -0-581| 1-744] 0-390] 1-888] 1-588] 0-457] —1-914] —0-345| — 1-457] —0-826 
3 1:517| 1-831] 0-718} 1-370] —1-501| 0-847] —2-208] 1-438] —1-558] 0-388 
4 2-200} 1-660] —0-181| 1-472] 0-369] —0-163] —1-316| 3-073| —1-738] 1-539 
5 2-210| 1-158} —0-933| 3-308] —0-565| —0-627| —3-207| 1-964] —0-535| 2-453 
6 1:513| 0-201] —0-870| 2-660| —2-778| —1-021| —2-072] 2-521] —1-119 | —0-032 
7 1-519} 0-967] —0-156| 2-195] — 3-380] — 1-229] —2-587| 1-461 | —0-884| —0-085 
8 1-156 | —0-084| —0-428| 1-974! — 1-469] — 2-485] —2-661] 1-322] 0-314] 0-778 
9 2-710} 0-600} —1-400| 2-882} 0-090] —0-924] —2-419| —0-400| 0-261| 2-117 
10 1-421 | — 1-186] —1-017| 4-137] 1-250] 0-375} —2-857] —0-422] —1-092] 1-108 
| 
11 0-197 | —0-960| —0-796| 4-284] 2-660} 0-381] —2-859] —0-501|} —0-420] 0-236 
12 0-762; 0-291] —0-574| 3-395} 2-309] 1-351! —2-526| —1-049| —0-429] 0-302 
13 0-774| 0-818] —0-576| 1-606] 0-322] 2-567! —3-180| —0-095| —1-507] 1-369 
14 | —0-429) 2-450] —1-157| 0-909] 0-735] 3-124|] —3-628| —0-175| —0-230| 0-516 
15 | —1-001| 1-632] —0-970| —0-456| 0-804] 2-795| — 1-725] — 1-338] —0-976] 1-227 
16 — 1-505 3-047 | — 1-761 | —0-061 | -- 0-463 2-705 | —3-168 | —0-590 | — 1-329 2-700 | 
17 — 0-686 2-402 | — 1-200| —0-182| —0-782 2-730 | —3-724 0-443 | — 1-928 3-006 
18 — 1-027 1-517 | —0-892| —0-567 | — 0-453 2-860 | —4-270)} — 1-260} — 1-959 3-701 
19 — 1-518 2-550 0-920 | —0-180} —0-916 3-655 | -— 4-656) —0-611 0-487 3-070 
20 | —1-732| 1-437] 0-704] 0-098] 0-466] 2-883} —5-368| —1-042| 0-382] 1-421 
21 | —2-439/) 1-693] —0-279| 1-630] 0-572] 3-920) —5-492| —0-932] 0-221] 1-348 
22 — 1-696 2-455 | — 0-694 1-039 | — 0-208 4-688 | — 5-624] — 0-868 | —0-737 | — 0-587 
23 | —0-791| 1-207| —0-879| —0-307| 0-707} 3-837| —3-301| — 1-788] —1-310] —0-459 
24 — 2-302 0-219 0-840 | — 1-326) —0-834 4-721 | — 5-446 0-487 | —0-404 — 0-667 | 
25 — 2-009 0-204 0-564 | — 1-243 | — 1-002 2-830 | — 5-222 0-907 | — 2-243 | — 2-541 
26 | —3-029| —0-222| —0-066| 0-119] —1-:140| 4%-901| —2-483| 0-802] —0-762] —0-844 
27 — 2445 0-350 2-557 | —0-874| — 2-041 &-371) —2-210 0-198 | — 1-472 | —0-562 
28 | —1-945| 0-059/ 1-643) -0-769| —2-802| 4-196] —2-266| —0-883| —0-560 | —0-322 
29 | —0-863; 0-621} 1-857] -2-196| —3-613| 4-025] —2-122| —1-375| —1-942] —3-020 | 
30 | —0-970| 0-727| 1-845| — 1-588} —3-191| 3-240] —1-132| —2-324| —2-898 | — 1-927 
31 0-290) 0-272] 3-143) —2-759| —3-016| 3-521 | —0-628| —3-390| —2-032| 0-157 
32 | —0-336| 1-699) 3-554] —2-583] —2-369| 4-316| 0-789| —2-437] —1-343| 1-206 
33 1-065 | 1-840) 2-232! — 2.503 | —2-386| — 0-016] 0-235 | — 2-506 | — 1-413}, 0-900 
34 | -O-119| 2-031] 2-117 | — 0-830] — 2-455) —0-911 | — 0-012} —2-273| —1-632| 3-060 
35 1:189| 2-443| 2-630/ 0-726| —2-989| —1-524| — 0-366] —0-022|) —1-451| 2-774 
36 1-595; 3-005) 2-868) 0-070] — 1-781 | — 1-730] —1-559| 0-643 | — 1-438] . 1-206 
37 1:717| 3-491) 2-357| 0-616) —1-725| —1-512| —2-165] 0-323] —1-792] 2-438 
38 2-729) 1-871| 2-747) 0-315! —1-208| —3-368| —2-913] —0-273| —3-068!| 1-523 
| 39 2-827| 2-177| 2-955) —0-533| —1-477| —3-442| —3-367| —1-836| —2-774| 1-183 
| 40 2-764 0-790 1-470 | —0-148 | — 2-403 — 1-018 | —3-359 | — 1-909 | — 1-105 0-046 
41 2-198} 1-113/ 1-915) —0-782| — 1-594] —0-493| —2-111| —0-843| —1-297| — 1-493 
42 1:975| 0-240) 0-587| 0-063) —1-777| 1-487| —3-012| —1-482] 0-207] —3-627 
43 0-806 0-219 0-549 | — 0-260 | — 2-580 0-369 | — 3-768 | — 2-331 | —0-611 | —2-872 
44 0-179 | —1-257| 1-425] —0-789| —3-576| 1-709} — 4-300] —1-529] —0-098 | — 1-473 
45 | 0-458 | — 1-836 1-062 0-014 | — 3-619 0-459 | —3-123| —3-324 1-713 | —1-400 
46 0-673 | —1-174| 0-737] —2-347| —3-398| —0-109| —2-s64| —3-234| 9-422) —0-837 
47 0-290 | — 1-028 2-308 — 1-584 | — 3-558 0-945 | — 1-828 — 3-137 | 1-694 | — 1-626 
48 1-710) —0-752| 2-245) —2-549| —2-666| 1-447] 0-018] —3-296| 1-604] —1-954 
49 1-144) —0-989] 1-750| —3-155| —1-126] 2-865] —1-978| —0-889| 0-154! —0-386 
50 0-616 | —0-261| 1-813) —0-893| —0-485 384) — 1-014] — 1-949] — 1-390 | —0-057 








The 501st and 502nd terms are 0.175 and -1.299 respectively.













Table 10. Series 8 























































| 
50+ 100+ | 150+ | 200+ | 250+ | 300+ | 350+ | 400+ | 450+ 
| 
| | 
| 3-81 1-20 | 3-54 2-20 2-59 | —3-98 | 0-82 | —3-62 | —2-44 
| 4:38 1-12 3-40 1-57 1-93 | —605 | 463 | —3-84 | 0-83 
4-07 | —0-30 | 5:27 0-06 | 20 | —7-88 | 6-65 | —2-95 4-59 
2-49 | -1-76| 676 | —3-50 | —1-77 | —7-71 | 7-52] —2-44 | 4-60 
1-67 | —1-94 | 7-00 | —7-26 | —3-27 | —7-13} G41 | —2-:10| 2-68 
0-51 | —1-69 | 6-29 | —7-70 | —5-20 | —6-65 4-61 | —0-77| 1-43 | 
0-32 | —2:28 | 630] —4-76 | —5-01 | —6-17 1-47 0-46 | 2-35 | 
—1-08 | —2-68 | 7-92 | —0-13 | —2-53 | —6-32 | —1-11 | —0-20 | 2-98 
~2-31 | —2-61 985 | 490} 0-10 | —672 | —2-46| —0-87| 2-34 
1 —1-71 | —2-10 | 10-27 7-76 2-73 | —6-76 | —3-20 | —1-29 | 1-38 
0-09 | —1-58 7-98 6-41 5-52 | —7-26| -238 | -249| 1-72 
3-41 | —1-85 4-55 3-91 7-83 | —8-23 | —1-20| —232 | 1-72 
5-33 | —2-21 0-56 1-90 8-65 | —7-15 | —1-46 | —2-29 | 2-26 
7-21 | —3-27 | —1-72 | —0-33 8-30 | —6-92 | —1-60 | —2-68 | 4-32 
7-67 | —3-69 | —2-35 | —2-09 7-54 | —7-76 | —0-59 | —3-74 6-63 
6-35 | ~3-32 | —2-30 | -259| 7-00 | —9-35 | —1-10 | —4-73 8-84 
5:70 | —0-88 | —1-53 | —2-72 7-59 |—11-06 | —1-53 | —2-84 9-47 
4-53 1-39 | —0-44 | —1-23 7-73 |—12:86 | —2-18 | —0-38 7-42 
3-83 1-69 | 1:92! 0-58 8-63 |—14-11 | —2-56 | 1-22 4:78 
4-40 0-47 | 3-36 1-04 | 10-31 |—14-71 | —2-59 | 0-80 0-96 
| | 
4:13 | —1-21 | 244] 1-57 | 10-87 |—12-483 ] —3-36 | —1-04 | — 1-80 
2:57 | —0-72 | —0-33 | 0-37 | 11-52 |-11-76 | —1-91 | —1-95 | —3-12 
0-96 0-37 | —2-82 | —1-38 10-07 | — 11-95 0-48 | —3-87 | —5-08 
— 0-45 0-70 | —2-82 | —2-84 8-22 | —9-74 2-29 | —4-04 | —4-87 


—0-40 4-75 | —218 | —631 | 10-40 | 


— 0-62 3:15 | —2-57 | —4-48 | — 6-95 2-48 | —3-98 | —3-38 
0-49 5-51 | —3-31 | —8-31 10-78 | 
| 
































1:47 5-53 | —4-14 |] —9-18 9-90 3-22 | —4:70 | —4-92 | —4-53 
1-64 6-47 | —5-66 | -—896 | 902 | —2-08 | —7-64 | —5-86 | —3-28 
2:77 7-91 | —6-74 | —7-63 9-29 0-12 | —849 | —5-33 | -O-14 
| 
4-09 7-69 | —7-08 | —6-30 5-69 1:40 | —8-02 | —4:35 | 2-39 
5-14 6-63 | —5-25 | —5-57 0-71 1-47 | —6-85 | —3-75 5-76 
6-06 6-07 | —1-:51 | —5-97 | —3-59 | 0-55 | —3-55 |\-3-40 | 7-91 
7-09 6-23 1:03 | —5-56 | —6-04 | —1-69 0-17 | —3-30 7-03 
8:27 6-18 2-51 | —4-86 | —6-35 | —4-30 2-28 | —3-73 | 6-22 
7-42 6-43 2-56 | —3-77 | —7-34 | —6-80 2-15 | —5-52 4-84 
6-20 6-93 1-03 | —3-20 | —8-34 | —8-69 | —0-61 | —6-98 3-40 
3-90 5-88 | —0-30 | —4-03 | —6-52 | —9-52 | —3-65 | —6-02 1-37 
2-31 4-92 | —1-62 | —4-43 | —3-50 | —8-24 | —4-56 | —4:43 | —1-69 
0-82 3-06 | —1-57 | —4-64 0-90 | —7:31 | —4-67 | —1-66 | —6-17 
| 
—0-03 1-45 | —1-18 | —5-46 3-11 | —7-69 | —5-19 | —0-22 | —8-81 
— 1-70 1-49 | —1-30 | —7-27 4-68 | —9-11 | —4-90 0-49 | —8-08 
— 3-69 1:98 | —0-83 | —8-88 4:05 | —9-29 | —6-12 2-36 | —5-88 
— 4-39 2-17 | —2-61 | —9-53 2-01 | —8-33 | —7-52 4:77 | —3-27 
— 4-01 3-70 | —4-:04 | —9-60 1:13 | —6-35 | —8-34 5-77 | —2-28 
—2-97 | 5-23] —5-69 | —8-46 1-68 | —2-80 | —8-72 5-56 | —2-83 
—2-25| 566] —7-39 | —5-63 4:15 | —1-88 a — 3-39 | —2-36 
—1:25| 642] -—618 | —2-45 6-01 | —1-69 —0-45 | —1-24 
—0-89 | 433] —2-03| -—0-15 5-13 | —1-64 | a 7 —3-05 | -—0-01 
0-04 3-94 2-44 1-52 |. 0-72 | —1-30 | —3-75 | —3-96 | —0-69 
















Tables of autoregressive series 


Table 11. Series 9 





















































The 501st and 502nd terms are 0-950 and 3-505 respectively. 














No. of term | 0+ | 50+ | 100+ | 150+ | 200+ | 250+ | 300+ | 350+ | 400+ | 450+
1 2-390 | — 1-093 | —0-390 | — 1-528 1-023 | — 1-549 | —0-093| —0-119| —0-069 0-177 
2 0-985 | — 1-147} —1-119| —1-507| —0-098 | —2-845| —1-162| —0-789| 0-054| —0-044 
3 | —0-655| 0-472 1-477 | — 1-378 | —0-499| —0-766| 0-734) —0-560| —0-903 | —0-493 
4 | —0-679| 0-995) 3-374) —0-164| —0-493| —0-598| —0-108|} —2-073| —0-603 1-064 
5 | —0:044/ 0-400; 3-173] 0-405! 1-325) 0-223] 0-755| —2-348| —0-691/ 0-688 
6 | —1-457| —0-838| 2-029 1-080 | —0-915| —0-475| —0-022| —1-104| —1-737| 0-558 
7 | —0-731|} —1-760; 1-895| 1-358 1-159 | —0-521 1-113} —1-271 | —0-908 | —0-235 
8 | —0-724| —1-660| —0-669| 0-847| 0-391| 0-798) —1-301| —0-404| 0-731] —1-151 
9 | —1-567| 0-058) —1-578| 2-333) —0-093| 1-567 1-160} 0-481 1-316 | — 1-803 
10 | —1-654| — 0-226 Oar | 2-121! —1-850| 0-929 1:397| 0-588; —0-432| —3-082 
11 | —2-416| 0-303; 0-875| 0-822) —1-059 1-927 1-857} 0-865| 0-241| —1-691 
12 | —2-821| —0-742| —0-377| 1-245] —1-014| 0-203 1-502 1-798 | 0-387 | — 1-467 
13° | —0-701 | —2-360| 0-529| 2-618} —0-112| —1-467| 0-882} 0-663] —0-414| — 2-607 
14 | —1-515| — 1-644) 2-016 | 3-134| 0-992) —0-550 1:052| 0-519] 1-696) —2-160 
15 | —2-112| —1-397| 1-186) 1-582) 0-913 | —0-411 2-911 | —0-064| 1-998) —1-505 
16 | —1-602; —1-792| —0-430| 1-099! 2-637 1:384| 3-643) 0-133] 0-263) —0-722 
17 | —1-805| 0-826) —0-832| —0-230 1-982} 0-591 1:186| 0-441 1-270 | —0-876 
18 | —1-624) 1-679} —0-309| 1-113 1-315] 0-809 1-862 | —0-974] 0-485 1-045 
19 | —1-:060; 0-732) —1-515| 0-289! 1-054; 1-:306| 0-112) —2-054| —1-974 0-950 
20 | —0-022| 1-093| —1-868| —0-153| 0-423} 3-434 1-032} 0-386} —2-867| 0-683 
| 
21 0-546 1-058 | — 2-501 | — 1-891] —0-181 3-421| —0-491| 0-826] —2-328 —0-594 
22 | —0-886 1-536 | — 2-059 | —2-573| —0-464| 2-275 1-032 | — 1-679 | —1-682 —2-596 
23 | —1-321 0-244 | —1-121) —1-459| —0-476| 2-355] 2-384) —1-:119| 1-102) —1-126 
24 | —1-014| 0-210) —2-314) 0-245 1-183 1-366| 0-624) —1-946| 2-350| —0-665 
25 | —2-254| —0-694| —3-101| —0-027| 0-973 1-552 | —0-917| — 1-392 1-344 | —0-823 
26 0-582 | — 1-028 | — 2-921 1-390| 0-801 | --0-026| —1-363| —0-476 1-133 | —0-469 
27 0-272 | —2-704; —1-972| 0-253| —1-514| 0-540/ —1-593| —0-341 1-642 | —2-195 
28 0-358 | — 1-164} —3-015| —1-583| —0-455| 9-352) —1-606| —0-470| 0-800/ 0-448 
29 0-981 | —1-382| —4-109| —1-577| 0-204! 0-665| —3-104| —0-384| 1-977] —0-573 
30 | —0-497| —1-815| —2-635| —2-094| —0-132| —0-531| — 3-599 1-604] 0-233) —0-957 
| | 
31 | —1-078| 0-006| —1-375| —0-473| 0-512| —0-877| —1-665| 2-262| —0-732| 0-114 
32 | —0-318 1-277 | —2-038 | —0-485| 0-550) 0-195 1-684 1:352| 0-864| —1-102 
33 | —0-597 1-496 | —0-713| —2-421| 0-717) 0-565} 2-532) 0-700| 0-571 | —1-724 
34 1-697 1-785 | —0-523| 0-261} 0-132) 0-470| 0-471 1:276| 1-853) —1-961 
35 2:585| 1-035) —0-443| 1-019) 0-400; 0-352) —1-502 1-128} 1-589! —1-539 
36 0-170 1-615| —1-932! 1-259) —0-241| 1-325| — 1-763 1-682| 0-980) —1-111 
37 0-497 1-734 | —1-592| 2-917) 0-136) 1-383| —2-083| — 1:061| 1-530 —2-282 
38 0-437) 3-754) 0-252; 1-018} 1-115) 0-956) —2-674| 0-442) 1-110) —0-783 
39 1-554) 1-252) 0-527) 1-601 | — 0-044 | 1-191 | —2-365| 0-482) —0-024| —0-421 
40 1-474; 1-540) —0-438 | 0-989 | one eae ~2-846| 0:395| 0-693) —2-011 
| | | | 
41 | —0-809| 0-808) 0-644| —0-188 | 0-503} 0-978 ~ 4-160 0-722! 0-102! — 1-903 
42 | 0-240 1-408 | 0-095) —0-910| 0-215) 0-707) —3-199| 0-477| 1-465) —2-617 
43 | 1-088; —0-382| 0-452) 0-166 —0-392| —0-319| —2-662| 0-646) 0-086) —2-433 
44 | —0-609| —0-069| 0-955) —0-445; 2-271) 0-747) —4-524 1:893| 0-255) —2-710 
45 | 0-440 | — 0-179 1-307 | —0-126| 3-453 | —1-327| —3-303| 2-771) 0-186) —2-601 
46 | 0-486 | —1-966| 1-319) —1-096| 2-055 1-291 | — 1-198 1-159 1-571 | —2-670 
47 | —0-523 1-004 | —0-085| —0-008| 2-397! —0-848| —0-765| — 1-065 | — 0-233 | —0-866 
48 | —1-:337) 0-701) —0-271| —0-021| 2-095) —0-142| 0-347) —2-119| —0-151 | —0-016 
49 | —1-116 1-259 | —1-908]| 2-390 1.924 | —0-749| 0-432) —1-117| —0-485)| 0-940 
50 | —1-675 1-385 | —1-725| 2-176) 1-331 0-167; 0-961 | —2-044| 0-596| 0-001 
| | | | | 






















































Table 12. Series 10 
No. of term | 0+ | 50+ | 100+ | 150+ | 200+ | 250+ | 300+ | 350+ | 400+ | 450+
1 — 0-767 | — 1-484 0-004 | — 4-339 0-422 | — 6-040} — 0-388 | — 1-455 | —0-607 | —0-268 
2 — 2-015 1-188 3-435 | — 2-353 | — 1-673 | — 4-886 0-067 | —3-827 | —0-394 0-644 
3 — 1-877 2-449 6-949 | —0-014 | —0-726| —2-132 1-022 | — 5-830 | — 0-820 1-531 
4 — 2-514 1-262 7-956 2-241 | —0-877 | —0-377 1-069 | — 5-604 | — 2-443 1-920 
5 — 2-558 | — 1-597 7-172 3-830 0-557 0-130 1-778 | —4-520 | — 3-185 1-111 
6 — 2-281 | —4-047 3-242 3-940 1-443 1-130 0-120 | — 2-574) — 1-551 | — 0-888 
7 — 2-797 | —3-596 | — 1-598 4-751 1-215 2-745 0-403 | —0-091 1-202 | —3-336 
8 — 3-590 | — 2-158) —3-835 5-378 | — 1-235 3-383 1-781 1-776 1-666 | — 6-307 
9 — 4-967 | —0-273 | — 2-545 4-362 | —3-025 4-276 3-614 2-863 1-473 | —6-961 
10 — 6-489 0-037 | -- 1-259 3-354 | —3-724 3-215; 4-587 4-060 1-174) —5-971 
11 — 5-356 | — 2-183 0-417 4-127 | —2-696 | —0-068 4-121 3-697 0-141 | —5-694
12 — 4-162) — 4-064 3-104 5-996 | —0-112 | — 2-233 3-291 2-556 1-264 | —5-438 
13 —4-012| —4:-776 4-392 6-115 2-138 | — 2-833 4-471 0-899 3-318 | — 4-640 
14 — 3-934 | —5-013 2-849 4-827 5-045 | —0-616 6-916 | —0-156 3-281 | —3-107 
15 — 4-127 | — 2-301 0-106 2-022 6-462 1-330 6-558 | — 0-180 3-220 | — 1-974 
16 — 4-196 1-655 | — 1-617 0-924 5-901 2-580 5-618 | — 1-094 2-386 0-428 
17 — 3-613 3-703 | — 3-347 0-294 4-314 3-479 3-013 | — 3-168 | —0-959 2-407 
18 — 1-898 4-339 | —4-741 | —0-291 2-218 5-971 1-537 | — 2-551 | —5-115 3-117 
19 0-265 3-979 | —6-043 | — 2-359 0-102 8-250 | —0-307 | — 0-400 | — 7-475 1-631
20 0-354 3-744 | —6-336 | —5-022) — 1-461 8-364 | —0-074] —0-840 |] — 7-347 | —2-360 
21 — 1-064 2:373 | — 5-069 | —5-804 | — 2-134 7-431 2-456 | — 1-844] —3-242 | — 4-538 
22 — 2-361 0-948 | — 4-722) —3-628 | —0-434 5-358 3-363 | -- 3-555 2-457 | —4-477 
23 — 4-320 | —0-838 | — 5-761 | —1-116 1-563 3-730 1-554 | — 4-380 5-668 | — 3-478 
24 — 2-989 | — 2-423 — 6-897 1-976 2-737 1-398 ; — 1-335 | —3-517 6-139 | — 2-057 
25 — 0-856 | — 4-951 | — 6-678 2-985 0-715 0-213 | —3-839 | — 2-019 5-561 | —2-718 
26 0-911 | —5-398 | —6-913 & 712) - 1-037) —0-113 | —5-161 | — 0-933 3-848 | — 1-514 
27 2-411 | —4-845| —8-374| —2-286| — 1-294 0-434 | —6-862 | — 0-400 3-429 | —0-879
28 1-700 | — 4-445 | —8-390| — 4-965 | — 1-037 0-003 | — 8-566 1-630 2-081 | — 1-167 
29 —0-414| —2-461) —6-417| —4-791 0-018 | —1-091 | — 7-657 4-255 | —0-158 | —0-730 
30 — 1-623 0-792 | — 4-902 | —3-273 1-089 | — 1-006 | — 2-456 5-218 | —0-350 | — 1-322 
31 — 2-176 3-598 | —2-896 | —3-626 1-905 0-003 3-659 4-312 0-265 | —2-813 
32 0-116 5-347 | — 1-258 | —2-091 1-684 0-977 5-724 3-410 2-320 | —4-394 
33 3-800 5-117 | —0-379 0-532 1-299 1-425 2-965 2-723 4-008 | — 4-966 
34 4-292 4-571) — 1-720 2-890 0-346 2-404 | — 1-364 2-973 4-229 | — 4-377 
35 3-318 4-203 | —3-294 5-830 | —0-133 3-315) — 5-066 0-847 4-178 | —4-613 
36 1-941 6-092 | —2-512 5-986 0-796 3-400 | —7-564| —0-112 3-591 | —3-669 
37 2-030 5-852 | —0-589 5-271 0-898 3-274| —8-153| —0-065 1-837 | —2-151 
38 | 2-737 4-931 0-170 3-794 1-041 2-796 | —8-032 ; 0-380 0-919 | —2-542 
39 | 1-186 3-306 1-126 1-350 1-199 2-417} —8-919 1-172 0-194 | —3-624 
40 | 0-177 2-579 1-248 | — 1-322 1-013 1-967 | — 8-994 1-577 1-219 | —5-332 
41 0-689 0-802 1-262 | — 1-963 0-123 0-637 | — 8-096 1-794 1-330 | — 6-487 
42 0-061 | —0-476 1-719| — 1-944 1-900 0-464 | — 8-932 3-078 1-108 | —7-179 
43 0-162) — 1-104 2-567 | — 1-282 5-481] — 1-135] —9-081 5-260 0-740 | — 7-255 
44 0-634] — 2-942 3°283 | — 1-535 7-134) —0-180| —6-721 5-406 1-831 | —7-061 
45 0-093} — 1-681 2-243) — 1-055 7-504) —0-489| —3-617 2-252 1-411} —5-005 
46 — 1-551 0-324 0-555 | —0-414 6-782} —0-585| —0-272| — 2-345 0-486 | — 1-992 
47 — 2-869 2-455 | —2-419 2-462 5-633 | — 1-148 1-942 | —4-823| — 0-656 1-252 
48 — 4-055 3-924) —4-664 5-091} 4-136) —0-803 3-233 | —6-176| —0-369 2-374 
49 —4119 2-699 | — 5-448 5-392 0-184] —0-403 2-466 | — 4-452 0-100 2-935 
50 — 3-651} —0-113 | ~ 5-168 3:288 | —4-711} — 1-203 0-307 | — 1-755 0-250 5-547 



















Table 13. Series 11 



































The 501st and 502nd terms are 0-501 and 0-445 respectively.














No. of term | 0+ | 50+ | 100+ | 150+ | 200+ | 250+ | 300+ | 350+ | 400+ | 450+
1 | —1:301) 1-170) 0-612 | — 1-339} —2-257| 2-498] 1-177| —0-871| 0-363} —0-076 
2 | —1:335) 0-075| —0-439| —1-221| —1-741| 2-166) —0-081| —1-167| 0-620) 0-114 
3 | —1-576| —0-486 | — 0-365 | —0-445| —0-616| 0-982| —0-848/ —1-905| 0-562 | — 0-389 
4 | —1122) — 1-443) —0-782 | —0-673| 0-852| 0-843} —2-310| —2-071| 0-961 | — 0-221 
5 0-162/ 1-165/ —0-081| —0-640| —1-186| 1-785| —0-003| 0-002) —0-628| — 1-452 
6 | — 0-452) 0-808) —1-575/ 0-642 | —1-173| 1-081] 0-674] —1-759| 0-157| 0-848 
7 | —1-075| —0-228| 0-026! —0-339| —0-415| 0092] 0-801| — 0-252) —0-138| 1-904 | 
8 | —1-030| —0-816| —1-066} 1-731| 2-073} 0-293| 0-840| 0-112! 0-438! 0-860) 
9 0-178 | —2-128/ 0-996) —0-323| 0-656) 2-189) 0-029! 1-907! 1-097, 0-145) 
10 0-231) —2-151| —0-597| 1-207) 1-076} 1-534] —0-001/ 0-320) 0-839) 0-763 
11 | —0-580| —0-541 ~ 0-675 | 0-795 | —0-571| 1-171 | —0-086 ~ 1-163 | —0-418| 0-706 
12 | —2-478/ 1-039; 1-200| —1-024| 2-018] 0-508) —1-190| 1-378! 0-978) —0-547 
13 1-091} —1-203| 0-101 —1-198| 0-880] 0-346] —0-452| 2-367| —0-615| — 0-550 
14 2-273 | —0-468| 0-470 | — 1-520 | — 0-293 | — 1-557 | —0-444| 0-523 | — 1-122 | — 0-300 | 
15 2-614) —0-012/ 1-183) —1-053| —0-092| 0-285 | —0-748| —0-309| —2-996| 0-079 | 
16 2-545 | — 1-234 | —0-604! 1-019) —0-271} 1-342| —0-100| —0-759| —2-796 | — 0-427 | 
17 2-644) 0-307 | —0-125) 0-336/ 0-069] 1-662 | —0-122| —0-571| —0-796 | — 0-293 | 
18 0-085 | —0-513 | —0-464| 1-967} —0-784| 2-921] 0-255] —0-748| —0-130| 1-119 | 
19 | — 1-679 | —0-283 | —0-774| — 1-686 | —0-532| 2-086| —0-215| 0-012! — 1-669) —0-221 | 
20 | —1-041 0-141 | — 1-469 | —0-291 | — 1-862! .2-610 | —0-355| —0-715 — 0-956 | 0-345 | 
| | 
21 | —1-369| 0-318) —1-032| 1-022) —2-056| 2-528] 1-874] —0-122/ —1-522| —0-904 
22 | — 1927) 1-169) —0-478| 1-817) —0-931| 2-307 | — 0-111 | —0-551 | — 0-367 | — 0-002 
23 | —1-462| 0-416) —2-042| 2-689/ —1-033| 1-079] —1-866| —0-817| 0-498! —0-482 
24 | —1-5623| —0-315| —0-420| 1-226| 1-356] —0-187| —0-707| —0-005| 2-575 0-672 | 
25 | — 0-286) 0-372) 1-586/ 1-083| 0-838] 0-323 --0-426| 0-737| 0-999 | — 0-076 | 
26 0-343 | — 0-913 | — 0-867 | —0-299| 1-005} 1-073) 1-126] —0-434; 2-012) 0-141 | 
27 | —2-558/ 1-618} 0-467| —0-412| 0-847| 0-085; 0-587| 1-337) —0-617| — 0-264 | 
28 0-731| 0-742} —0-360| 1-914| —0-706| 0-225| 1-278| —0-089| 0-130 | 1-940 | 
29 | —0-630| 0-217) 2-127 1-122} —0-410| 0-926; 0-386) —0-837 1-985} 0-503 | 
30 0-857| 0-979; 0-979) —1-742| —0-393| 1-235| —0-516| 0-720| 0-764) —0-580| 
31 0-656 | — 0-388 | —0-222| —1-145| —0-593| 0-453/ —1-360| —0-809/ —0-500) — 2-443 | 
32 0-062 | —1-779| —1-313| 0-388} 0-206) 0-904) 1-101] 1-226) 0-161! 0-028) 
33 | -0-304| 0-743| — 0-475! 2-669| 0-808] 2-384| 1-930| 0-356 —0-785| 1-370 
34 | — 0-244) —0-113| —0-786| 1-474] 1-058] 0-791] 0-486] 0-785! 0-511) 2-055] 
35 1-079 | — 1-830) 0-682 | —0-345| —0-703| 1-303} 0-209] —i-835| 0-778) 0-300) 
36 1-140) —2-154| 0-730} —1-160} 0-161] 0-368| 1-173| —0-148) —0-235 0-360 | 
37 | —0-457 | — 1-251 1-439 | -0-987| 0-434) —0-455| 1-895! 0-250) — 0-561 | — 1-062 | 
38 | —0-554/ 0-220) 0-727| —0-968| 0-638| —0-425| 1-273| —0-935! —0-141| —0-498 | 
39 0-698 | 0-369/ 0-191) —2-854/ 0-614) —2-542| 0-729| 0-602| —0-765! 0-593 
40 1-176 | +r — 1-452 | — 0-357 | —0-276 | —1-284|) —0-593| 0-796) —1-177| —0-211 
| 
41 0-923 | —0-925 | — 1-934 Lael —0-619| —2-451| 0-803} 0-428) —0-863 | aia 
42 | —1-984| 0-704/ —0-920| —1-678| —1-858| —1-428| 0-333} —0-296 0-010 | —0-703 | 
43 | —1-603| 0-205| 0-858| —1-335| —0-811| 0-169! 0-523 0-092 | —0-620 | —0-171 | 
44 | — 2393) —1-136| 0-385] 0-588| —0-414| 1-360} 0-492) 0-615; 0-385/ 0-816! 
45 | —2-101| —0-915| —0-658| 0-298) —0-257| 2-134] 0-701! —0-293| —0-625| 0-907 
46 | —1:377| —0-260| 0-485| —0-525| 1-004| —0-035 —0-184| —1-405; 1-051) 0-758 
47 | —0-525| 0-373) —0-035| —0-599| 2-024! —0-553 —1-744| —2-676 1-411 1-138 
48 | —0-986) 0-776| 1-162) —0-978| 2-732) 1-849 —0-358| —0-222' 1-312 0-329 
49 | —0-619| 0-925) —0-216! —0-110| 2-918] 1-829| —1-500| —0-286| 0-282) —0-563 
50 | —1-290| - 0-811 | anand —1-070; 3-102} 1-091 | —2-366| —0-778| —0-749| 1-225 































































Table 14. Series 12 
No. of term | 0+ | 50+ | 100+ | 150+ | 200+ | 250+ | 300+ | 350+ | 400+ | 450+
| 1 | —2:394) 0-486 | —0-917 | — 3-592) —4-036| 4-305 | —0-366| —3-727| 2-109 | — 1-211 
| 2 | —3088| — 1-386 | — 1-734) —2-264/ —1-176| 2-207) —3-694| —4-299/ 2-603 | — 1-004 
3 | — 2038 | —0-603 | — 1-530 | — 1-334) — 0-462) 2-060 | — 3-884 | —2-863/ 1-181 | —1-951 
| 4 | —1-150) 0-838) —2-391| 0-306) —1-093| 2-243| --1-751| —2-759| 0-155 | —0-796 
| 5 | —1-321| 0-995) — 1-839} 0-665 | — 1-386 1-530| 0-817) —1-855| —0-558| 2-004 
| 6 | —1-908 | —0-140| —1-893| 2-309) 1-095) 0-854) 2-614 | — 0-549) —0-254| 3-462 
7 | —1-260| —2-780| —0-167| 1-885] 2-553) 2-364) 2-496| 2-230) 1-097) 3-952 
8 | —0-202 | ~5-139| 0-166 2-126 3-337) 3-707) 1-438) 3-048) 2-173) 3-379 
9 | —0-172| —4-804| —0-409| 2-191) 1-823) 4-067) 0-247) 1-075) 1-424) 2-447 
10 | —2-566| —1 676 | seed 0-323} 2-355| 3-128 | — 1-637 | 1-036) 1-457) 0-455 
11 | —1-646 | —0-644) —1-039| —1-938| 2-559) 1-753| —2-376| 2-970] 0-276 | — 1-273 
12 1-746 | —0-339| 1-280) —3-813| 1-344) -1 192| —2-239| 3-271 | —1-547| — 1-928 
13 5-357 | —0-063| 2-071| —4-279| 0-107| —1-903| —2-023| 1-805| — 4-836| — 1-405
14 7-565 | —1-134) 1-034) —1-781 | —0-825| —0-155 | — 1-206 | — 0-409 — 7-342 | — 1-009 
15 8-287 | —0-909 | —0-023| 0-516) —0-892| 2-443) — 0-437) — 1-924) — 6-454 | —0-700 
16 5-418 | —0-946| —1-006) 3-426) —1-353| 5-686) 0-377| —2-659 | —3-559| 0-853 
17 0-138) —O-869| — 1-869) 1-824] - 1-574) 7-119) 0-419) — 1-952, —2-357| 1-068 | 
18 | —3-599| -0-342| —3-022} 0-003) —2-917| 7-598 | — 0-083 | — 1-532) — 1-769] — 1-093 | 
19 | —5:397) 0-376) —3-422) 0-113) —4-478| 7-326) 1-573 | — 0-832) —2-289 | — 0-236 | 
20 | —6-064| 1-754) — 2-731) 1-940) —4-398| 6-567] 1-661 | — 0-700 | — 2-001 — 0-808 | 
| | 
21 | —5-434| 2-157 | — 3-335] 4-766| —3-632| 4-640) —0-825| — 1-171 | —0-558 | — 1-253 
22 | —4-468| 1-181) —2-723| 5-499] —0-440) 1-632) —2-446| —0-943| 2-961 | — 0-302 
23 | 2-484) 0-592) 0-258] 4-749] 2-170) —0-201| —2-703] 0-285) 4-536] 0-218 
24 | —0-156| —9-852| 0-779) 2-175! 3-612) 0-036) —0-625] 0-351) 5-521] 0-532 
25 | —1-487) 0-385] 1-194) —0-394|) 3-735) 0-225] 1-251) L581) 3188] 0-212 
26 | —0-827| 1-591] 0-565) 0-393; 1-597) 0-454) 2-967 | 1:474| 0-876] 1-907 
27 | —0-796| 1-775] 2-151) 1-752] —0-521| 1-313] 3-024] —0-006! 1-355] 2-495 
28 | 0-395| 2-136] 3-063) —0-012| —1-765| 2-453] 1-327) —0-023) 1-816) 1-21] 
29 1-488] 1-074] 2-072) —2-034| —2-274|] 2-494] —1-412| —0-832/ 0-821 | —2-359 
30 | 1-502 | — 1-666 | —0-566 | —1-843| — 1-413] 2-421] — 1-116 0-323! 0-155 | —3-172 
31 | 0-604] — 1-626} — 2-133 1-658 0-391 3-800 1-409 1-127) — 1-024 | —0-940 
32 | -0-331| —1-069| —2-850| 4-220! 2-194) 3-761] 2-593) 1-863) —0-693| 2-697 | 
33 | 0-413] —2-:193] —1-386| 3-468] 1-515] 3-540] 2-358 | —0-349| 0-527] 3-638 
34 | 1-760) —4-032| 0-630) 0-545] 0-731) 2-381] 2-470) —1-463| 0-692] 3-058 
35 1-272 | — 4-589 -825| —2-122| 0-480} 0-395] 3-433] —1-185| —0-064| 0-483 
36 | —0-034| —2-813 | -3-574| O-801| —1-182] 3-814] —1-507| —0-557 | — 1-496 
37 0-024 | —0-430] 2-650) —5-725| 1-255] —4-039} 3-208 | — 0-463) — 1-346 | — 1-294} 
38 1-220| —0-295| —0-297| —4-867| 0-704/ —5-136| 1-029] 1-040| —2-379 | — 0-886 | 
39 2-253 | — 1-034 | —3-586 | —2-975] —0-472} —6-081| 0-331] 1-804) —2-807 | — 0-337 | 
40) | —O-116| —0-286 | —4-716 | —2-516| —2-729| —5-549] O82) 1-168] — 1-888 | —0-631 | 
41 | —2-857| 0-407) —2-537]) —2-616 | —3-577 | —2-895 0-558 | 0-475 | — 1-294 | —0-695 | 
42 | —5-478| —0-545| —0-047| — 1-031} —2-984) 0-951] 1-015) 0-554) —0-094) 0-366) 
43 | —6-698| —1-718| 0-558) 0-472) —1-751| 4627) 1-538) 0-078] —0-081| 1-657 
44 | —6-006| —1-877| 1:123| 0-509] 0-570) 4-579) 1-001) —1-596/ 1-008) 2-398 | 
45 | —3-783| —0-833| 0-921} —0-275| 3-526) 2-171] —1-412| —4470) 2-561) 2-947) 
46 | —2:144| 0-798} 1-614) —1-535| 6326) 1-947] —2-412| —4-342| 3-625) 2-372) 
47 | —1-086| 2-220) 1-099) —1-661| 8-114/ 2-886) —3-447| —2-827| 2-989] 0-573 | 
48 | —1-413| 1-232] —2-004| —2-130/ -8-864| 3-292) —4-952| —1-717| 0-726) 0-669 
49 0-159| 0-857/ —4-092| —3-769| 8-191| 3-355) —4-594| —0-112| —0-772| 0-950 
50 0-956 —0-112| —4:721 — 4-822 | 6-745) 1-964) —3-745|) 1-355) —1-098| 1-156 







































































































































Table 15. Series 13 
No. of term | 0+ | 50+ | 100+ | 150+ | 200+ | 250+ | 300+ | 350+ | 400+ | 450+
1 0-032 | — 1-048 | —0-765| 0-828] 0-383] —0-654| 0-359] —0-690| 1-494] —0-145 
2 | —2-155| —1-443| 0-566) 0-326| 1-805] 0-163) —0-179| 0-798] —1-191| —1-625 
3 | —3-152| 0-399) 0-072} —1-922| 1-010] 0-694] —0-132] 0-722] —0-488] 0-857 
4 0-009| 0-484) —0-942/) —2-452| 1-934| 0-945] —0-240| 2-344] —0-312] —0-305 
5 | —0-426| —0-664/| —0-018| —0-442| 1-594] —1-615| —0-630| 1-154! 0-981] —0-026 
6 0-773 | —0-662| 0-176| —0-092! 1-793| —0-217| —0-232| —0-847| 0-243! 0-539 
7 | —1-167| —0-506 | —0-565| —0-989| 1-959) —1-432| —0-834| —0-248| 0-343| —0-415 
8 0-108 | —2-637| 0-031/ —0-611| 2-169] —0-575| —1-388| —1-205] —0-117| 0-715 
9 1-013 | —0-423| 0-868/ 0-106} 0-930] 0-078] —2-041| —0-805| 0-244! 1-585 
10 0-886 | —0-052) 0-559| 1-231 | —0-621| —1-060|} 0-075| 0-476| 0-954] 1-275 
11 | —0-553; 0-052} 0-384| —0-925| 0-610| —0-512| —0-228| —1-595| 0-899| —0-924 
12 | — 0-302} —0-161| 0-984] 0-176] 0-111] 1-145] —1-453| —0-830] 0-854] —0-346 
13 | —0-760 | —0-793 | — 0-465 | —0-635| —1-139| —0-693| 0-018] —0-791| 0-076] — 1-822 
14 | —1-115| 1-006| 1-961| 1-113| 1-894| —0-199| 0-054| 0-125| —1-085| —0-429
15 | —0-294| 1-312} 6-944| —1-338| 1-105] 0-882] —0-716| —0-816| 1-577] —0-204 
16 | —O0-866) 1-713) 0-547| —2-081 | —1-012| —0-047] 1-389] —0-201] 0-142] 0-076 
17 | —0-393| —0-762| 1-554| —1-092| 0-290| 0-088! —0-748| —0-657| 0-081 | —0-097 
18 0-038 | —0-794| 0-856| 0-678| 0-18] 1-385| 0-240| —0-973| —0-828| 0-653
19 0-501 1-331] 0-827) —0-471| 0-455| —0-073|} —0-334| —1-562| 0-137] —1-724 
20 | —0-834| —0-877| 0-771] 0-465] 1-421| —0-526} 0-052| —0-360] —0-419| —0-818 
21 0-944 | — 0-785 | —0-385| —1-505| 2-358| 1-473] 1-767] —1-676| 1-563] —0-372 
22 0-350 | — 1-190) —0-714| —1-078| 0-702] —0-168| 0-056| —1-126| 1-473] — 1-906 
23 0-153| 0-537| —2-336| 0-103| 2-639| —0-960| —0-934| 0-916| —0-426| —0-521
24 | —0-385| —0-330/ 0-564] 0-916] 1-869] —1-885| —0-555| —0-788| — 1-020 — 0-440 | 
25 | —0-494| 0-183] 0-544) 0-469! 0-746! —2-311 1-229} —0-681} 1-143| 1-015) 
26 0-228 | — 2-011 | —0-267| 2-254] 0-311] —2-151| 2-391] 0-170) —2-405| —0-860 
27 0-557 | — 1-006] —0-476| 1-179] 0-236] —1-838| —0-853| —0-081| 0-090] —0-482 
28 | —1-857| — 2-128] —0-868| 0-988| —0-157| —0-173| —0-739|] 0-179] 1-221| 0-593 
29 0-771 | — 0-204 | — 2-337 | —0-846| —1-691| —0-668| 0-801 1:588| 0-378] 1-226 
30 0-173) —0-158| —0-135| 0-701] 0-049] 0-830; 0-690! —0-475| —0-564| —0-245 
31 1-272/ 0-246) 0-005] 1-607} 0-448] 0-415| 2-299) —0-809] —0-576| 0-871 
32 0-539] 0-285 | — 1-266} —0-275| — 1-719] —0-306| 1-615] —0-253] —1-412| 2-606 
33 1-108} 1-060} 0-863} —0-396| — 2-763] —0-389! 0-053] 1-141] 0-154] 0-043 
34 | —0-860) 0-720) 0-249] — 0-032] —0-371| 0-391| —0-730| —0-202|] 0-624| 0-071 
35 | —0-733) 0-112} 1-312} 0-321] 0-277] —0-967| 0-228! 1-439! 1-448! 0-896 
36 1-139 | --0-052 | —0-836] 1-523] —0-547| — 0-429} —0-419| 0-451 1-256 | —0-023 
37 0-630 1-460 0-497 | —0-522| —0-638 1-235 0-028 2-862 0-941 | — 1-338 
38 | -—0-775| 0-421 1-012} 0-692] —1-001| —0-450' 0-096! —0-447] 0-345/ 0-198 
39 | —0-736| — 1-425] 0-062} 0-989] —0-094| 0-269| —0-678| —1-514| 0-974! 1-789 
40 | —0-560| —1-781| —0-463] —0-512| 0-960 0-168 | —0-556 | —0-642|} 1-518] —0-870 
41 0-666 | —1-169| 0-823} —0-427| —0-980] 1-129] —1-797| —1-279| 0-345! —1-309 
42 0-088 | — 0-867 | —0-398} 0-033} —0-238| 0-144 — 1-571 | — 2-557 | — 0-789 | — 0-328 
43 0-084 | — 0-408 | — 1-662] 0-214] —0-783| —0-294| 0-393/ — 1-226) —1-358] 0-RI8 
44 0-914 | — 0-334) 0-634 | — 0-945) —1-547| 1-745] 0-030! 0-176) —0-329| —0-075 
45 1-293 | —0-255| — 0-476 | — 0-438} —1-631| 0-160} 0-290| —0-143! 0-504] 1-289 
46 | —1-812| —0-488| 1-440| — 0-973} 0-269) —0-536| 0-035| —1-317| 1-022 | —0-383 
47 | —1-849| —0-454| —0-573| —0-018! 1-665| 0-470| —2-720! —2-487| 0-581} —1-475 
48 0-193; 0-298] 0-080} 0-375) -1-277| —1-915| — 0-544: —1-157! 0-370] —0-770 
49 0-590 | —0-254| 2-936) 0-613) —0-830| 0-327! — 1-109 0-382] — 0-380] —0-573 
50 0-172) —0-583 | —0-554| 1-805} —0-815| 1-005] —0-727 —o-s08| = 0-541} — 1-011 











The 501st and 502nd terms are — 0-662 and — 0-337 respectively. 
















Table 16. Series 14 
No. of term | 0+ | 50+ | 100+ | 150+ | 200+ | 250+ | 300+ | 350+ | 400+ | 450+
1 — 5-539 | — 1-403 0-053 |. a 1-511; 3-610 0-158 0-581 1-721 0-234 | — 1-300 
2 — 5-006 | — 0-205 | — 0-547 l— 4-781 | 3-981 1-858 | —0-194 4-336 | —0-887 | —0-744 
3 | —3-163| —0-189| —0-646| —4-945! 4-169] 0-350) —1-134| 5-063] —0-112] —0-195 
4 | —0-204| —0-767| —0-261| —3-141| 4-388] —0-761| —1-382| 2-554] 0-564] 0-697 
5 0-191 | — 1-255 | —0-529| —1-972| 4-701 | —2-444| —1-788] 0-030) 1-019] 0-449 
6 0-420 | — 3-634 | —0-421| —1-209| 5-147] —2-883| —2-663| —2-449| 0-722] 0-861 
7 1-379 | —3-793| 0-670 | —0-238| 4-241 | — 1-871 | —4-077| —3-514| 0-529] 2-307 
8 2-193| —2-407| 1-506) 1-574} 1-470 | — 1-677] —3-078| —2-165| 1-175] 3-383 
9 1-170 | —0-699| 1-706] 0-925) 0-107] — 1-421 | —1-575| —2-219| 1-927] 1-643 
10 | —0-112| 0-273| 2-107) 0-407| —0-506| 0-421 | — 1-647] —2-189] 2-386 | — 0-230 
| | 
11 — 1-468 | —0-143 1-000 — 0-650 | — 1-750 0-480 | — 1-006 | — 2-089 1-737 | —2-896 
12 — 2-674 0-713) 2-008) 0-195 0-223 0-119 | —0-229| — 1-079 | —0-367 | —3-493 
13 — 2-501 2-167} 2-652) —0-799 2-225 0-773 | —0-465 | —0-958 0-305 | — 2-598 
14 | —2-281 3-741 | 2-461 | — 3-057 1-324 0-744 0-992 | —0-715 0-661 | — 1-036 
15 i 1-651 2-269 | 2-935 | — 4-055 0-634 0-520 0-576 | —0-965 0-656 0-063 
16 | —0-638 | —0-168 | 2-854 | — 2-254 | 0-216 1-585 0-377 | — 1-677 | —0-437 1-240 
17 0-625 0-011 | 2-499 | — 0-923 | 0-376 1-411 | —0-207 | —2-924| —0-672 | —0-391
18 0-172 | —0-780 | 2-093 0-577 | 1-726 0-233 | — 0-364 | — 2-738 | —0-939 | — 1-869
19 0-821} —1-649| 0-668) —0-409| 4-069 1-024 1-470 | —3-226| 0-866 | —2-232 
20 1-167 | —2-614| — 1-026) —1-816 4-315 0-842 1-855 | —3-305 2-895 | — 2-727 
21 1-026 | —1-514 — 3-798 | — 1-690 5-351 | —0-546 0-372 | — 1-107 2-326 | — 2-404 
22 | 0O-160| —0-688 ie 3-101 | —0-035 5-597 | — 2-906 | — 1-074 | —0-353 0-091 | — 1-722 
23 | —0-831| 0-183] —0-968| 1-275] 4-228] — 5-235] —0-138| —0-516| 0-080} 0-324 
24 | —0-766|) —1-466| 0-219 3-674 2-163 | — 6-456 2-776 | —0-221 | — 2-362 0-357 
25 0-130 | —2-710 | 0-249 4-583 0-501 | — 6-323 2-270 | —0-066 | — 2-549 | —0-251 
26 | —1-331 | —4-376| —0-704 4-192 | —0-687 | —3-900 0-370 0-217 | —0-401 0-138 
27 — 0-758 | — 3-663 Be 3-236 1-474 | — 2-697 | — 1-796 0-073 1-859 1-211 1-504 
28 0-005 | — 1-999} —3-342! 0-226] —2-575] 0-804] 0-585 1-462] 0-969 1-340 
29 1-656 | —0-122 | — 2-054 | 1-119 | —1-035 2-1$7 2-906 | —0-!31 | —O-LI6 1-593 | 
30 2-359 1-151 | — 1-854 0-843 | — 1-571 1-709 4-519 | —1-128 | — 2-024 3-689
31 2-874 2-387 | —0-156| —0-029 | —3-973 0-392 3-571 | —0-034| —2-014 3-304 
32 1-122 2-770 1-011 | —0-485 | —3-956 | — 0-032 0-939 0-324 | — 0-580 1-861 
33 — 0-936 1-966 2-499 | —0-198 | —2-088 | — 1-198 | —0-525 1-813 1-817 1-29) 
34 — 0-451 0-725 1-408 1-548 | —0-866 | — 1-731 | — 1-466 2-283 3-545 0-467 
35 0-601 1-275 0-796 1-279 | —0-547 | — 0-070 | — 1-322 4-467 3-932 | — 1-470 
36 0-112 1-461 1-183 1-326 ; — 1-169 0-339 | —0-625 3-325 2-898 | — 1-653 
37 — 0-913.) —0-456 0-966 1-807 | — 1-107 0-676 | —0-705 | —0-090 2-195 0-705 
38 — 1-621 | —3-013 0-008 0-813 0-327 0-743 | — 1-019 | — 1-803 2-484 0-733 
39 | —0-660] —4-255] 0-349] —0-436 | — 0-067 1-608 | — 2-565] —3-218 1-980 | —U-856 
40 0-172 | —4-041 | —0-018 | —0-853 | —0-475 1-541 | —3-883 | —5-195 0-147 | — 1-636 
41 0-604 | — 2-726 | — 1-856 | —0-507 | — 1-272 0-598 | — 2-596 | —5-332 | —2-187 | —0-3854 
42 1-492 | — 1-312} —1-399| — 1-076] — 2-709 1-632 | —0-944 | —3-091 | —2-808 | —0-196 
43 2-632 | —0-335 | — 1-087 | — 1-368 | —3-975 1-656 0-550 | —0-878| — 1-491 1-500 
44 0-338 | —0-201 0-944 | — 1-940} — 2-749 0-470 1-112 | —0-737 0-786 1-365 
45 — 2-794 | —0-507 1-009 | — 1-468 0-629 0-159 | — 1-772 | — 2-859 2-191 | —0-723 
46 — 3-049 | —0-160 0-718 | —0-270 0-789 | — 1-975 | —3-349 | —3-933 2-387 | —2-248 
47 — 1-367 | —0-176 3-221 1-050 | — 0-277 | — 1-925 | —3-907 | —3-279 1-150 | — 2-684 
48 0-193 | — 0-697 2-630 3-095 | —1-514| —0-125 | —3-360| —0-833 0-613 | — 2-840 
49 —0-152|} — 1-443 2-111 3-263 | —2-181 1-184 | — 2-433 2-218 | —0-046 | — 2-444 
50 — 1-707 = 0-674 1-333 3-846 | —1-479| | 1-186 | —0-198 1-665 | — 1-982 | — 1-605 




























































































































Table 17. Series 15 
No. of term | 0+ | 50+ | 100+ | 150+ | 200+ | 250+ | 300+ | 350+ | 400+ | 450+
1 — 0-855 2-075 | — 1-762 & 0-544 | — 0-486 | —0-089 0-764 | —0-399 | —1-032/| 0-232 
2 1-307 1-113 0-808 0-026) 0-014) —0-015 0-770 0-928 | —0-561 | — 1-831 
3 1-199 0-287 | —0-659 | —0-174| — 0-677 0-251 | — 0-794 1-222 | —3-251! 0-235 
4 1-244 0-661 | — 1-588 | —0-002) 0-416] — 1-055 1-107 | — 1-566 | —0-983 | 0-538 
5 1-259 | — 0-604 | — 0-965 0-011) 0-586 2-122 0-462 | — 2-577 0-736 | — 0-550 
6 0-129 | —0-867 | — 1-428} —0-290! 0-676 0-638 | — 0-679 | — 1-579 0-836| 0-762 
7 — 0-476 | —1-817 | —0-625 1-032 | —0-591 | — 0-403 1-336 0-023 0-803 — 0-953 
8 — 0-810 0-970 | — 1-698 1-143 0-472 | — 0-278 0-236 | —0-173 1-048 | 1-684 | 
9 — 0-306 0-773 2-032 1-236 0-668 | — 1-606 1-207 | —0-118 0-987| 1-821
10 0-173 0-541 | —0-871 0-454 0-049 | — 1-147 0-007 | — 0-817 | — 0-926 | —0-083 | 
| 
11 — 0-801 0-494 0-096 0-103| 0-586 0-005 | — 1-216 0-997 | —0-270 — 1-031
12 | —0-220| —0-342 | — 1-203] —0-047| 0-165 0-281 | — 1-249 0-392 0-427 | —0-963 | 
13 | -- 1-618 | —0-050 0-909 | — 0-632 | 1-116 2-150 — 1-696 | —0-603 | — 6-255 | 2-015 | 
14 — 1-287 | — 0-093 0-176 | —0-976 | 0-188 | — 1-157 | 0-729 | —0-229 | —1-159 | —0-905 | 
15 — 0-032 1-034 1-433 0-606 | — 0-419 | —0-698 | —1-579| — 1-212 2-104! — 1-069 
16 — 0-571 0-246 | — 1-273 1-714); 0-794) —O-511| 1-242 0-506 0-302; 0-114 
17 1-816) —0-193 | —1-877| —0-508 0-574 | —0-512 1-704 | — 1-335 1-022 | —0-522 
18 0-342 0-507 0-229 1-053 0-977 — 0-113 | 0-160 | — 0-352 0-073 — 0-062 
19 |) = (0-620 0-325 | — 0-033 1-193} 0-640) —0-801 | — 1-077 0-851 0-015 | —1-121 
20 | 0-406 | — 0-040 0-274 2-321 0-241 | —0-199| — 1-159 0-772 0-202 | 1-217
21 | —1-088 0-311 0-724 0-097 | — 0-602 | — 0-937 | — 1-001 | — 0-080 0-402 | 1-649 | 
22 — 0-166} —0-314| —0-158 | 0-455 0-525 | — 1-104 0-559 0-863 6 183 0-027 | 
23 | — 0-205 | 0-741 | —0-271 | — 1-979 1-219 | 1-027 0-154 0-685 0-603 | — 0-960 
24 | —0-417 2-126 — 0-125 | 0-621 0-059 | —0-287 0-113 | —0-363 | —0-777 — 0-235 
25 — 0-544 | — 0-835 i-604 | 1-357 | —0-012 | —2-073 1-136 1-064 0-967 | — 0-435 
26 0-651 | — 1-895 | —O-117 0-396 | — 0-520 0-181] 0-254 | — 1-096 1-182 | —0-120 
27 0-719 0-037 — 0-485 | — 0-368 1-384 0-460 0-227 | —0-316 | —0-633; 1-201 
28 — 1-160] —0-068 | —)-549 — 1-309 1-984 0-371 0-712 | —1-149| —1-587| 0-145 
29 0-396 1-332 | —0-237| —0-290| — 1-423] —0-399| — 0-680 0-313 | —1-123 | —0-249 
30 0-394 0-157 | —0-554| —0-102} — 1-370 0-395 | —0-166 | —0-734 0-449 0-608 
31 0-460 | —0-181 0-569 0-404 | — 0-077 | — 0-939 | — 0-787 1-414 0-564 | —0-791 
32 0-341 | —0-271 1-470 1-916 1-032 0-127 | —0-385 | — 1-084 0-676 2-151 
33 — 0-549 | — 0-736 1-052 | — 0-687 1-015 0-736 0-727 | — 1-149 | —0-275 | —0-644
34 — 0-536 | — 1-176 0-127 | — 2-022 1-232 | —0-244 1-055 | — 0-930 1-308 1-054 
35 =| — 1-602 1-663 1-389 | —0-914] —0-808 | —0-778 | —0-370 0-557 | — 0-578 1-848 
36 | — 0-042 1-545 0-098 | — 0-759 0-644 | —0-283 1-277 0-179 | —0-934 1-185 | 
37 | —0-013 | —0-325 0-356 | — 0-565 1-860 0-128 0-190 1-018 | —1-075 | — 1-822 
38 — 0-051 | —0-067 | — 1-255 | — 1-378 1-998 0-069 | — 0-472 | —0-861 | —0-753 | —0-462 
39 1-128 0-595 0-303 0-052 | —0-169 | — 2-273 0-460 | — 2-163 0-319 — 0-048 
40 1-635 0-505 1-444 1-183 1-312 | —0-731 0-783 0-756 | —0-482 | 0-163 
41 — 1-557 | — 0-639 2-046 1-848 0-656 | — 1-906 | —0-705 | — 0-647} —1-108 | —0-328 
42 1-396 6-413 1-345 0-360 1-394) —0-713 1-226; 0-732) — 1-633 0-104 
43 1-783 | — 0-640 1-117 0-406 0-247 | — 0-605 1-922 1-447 | —0-694/| — 1-090 
44 1-125} —1-148| —1-120) —1-401 | 0-704 0-142 0-211 1-270} —0-041| 0-869 
45 1-029 0-518 | —0-004 = 0-998 | — 0-223 0-425 0-245 | —0-055| — 0-029 0-518 
46 0-002 | —0-148 0-528 | — 0-699 1-367 | —2-728| — 1-504 0-964 | —0-620| —0-914 
47 0-352 | —0-109| 1-131} —0-813| —0-926| — 1-953 0-369 | —0-373| —0-169 0-054 
48 1-398} —1-018| 0-945] —0-286] —2-743| —0-218| — 0-853 | — 1-207] —0-007 | — 1-359 
49 0-329 | — 0-656 | 0-850 | — 0-084 0-452 0-849 0-308 0-902 0-456 | — 0-005 
50 — 0-502 0-018) 0-285 0-665 | — 0-326 0-167 1-359 | — 0-441 0-583 — 1-006 

















The 501st and 502nd terms are 0-663 and 0-452 respectively. 


































No. of term | 0+
1 3-064 
2 3-961 
3 4-084 
4 2-641 
5 0-387 
6 — 1-705 
ie ee 
8 | —1-587
9 | —1-359
10 | —0-922 
11 | —1-952 
12 | —2-974 
13. | —2-327 
14 | —1-644 
15 1-171 
16 2-452 
17 2-732 
18 2-185 | 
19 — 0-051 | 
20 | —1-314| 
2] ieee 
22 | —1-548 
23 | —1-434 
| 94 | —0-152 
| 25 1-268 
| 26 0-311 
27 0-104 
| 98 0-353 
29 0-796 
30 1-040 
| 31 0-197 
| 32 | —0-839 
| 33 | —2-624 
| 34 | —2-509 
35 | — 1-461 
| 36 | -0-403 
37 1-415 
| 38 3-393 
39 1-468 
40 1-314 
41 2-495 | 
42 | 3-212 
43 | 3315} 
44 2-043 
45 0-941 | 
| 46 1-412 | 
47 i-412 
48 0-345 
49 | 1-749 
50 2-864 | 





2-563 
2-048 
0-368 
— 1-487 
— 3-636 
— 2-287 
0-076 
1-768 
2-401 
1-415 


0-306 
—.0-464 
0-371 
0-886 
0-596 
0-720 
0-819 
0-501 
0-452 
— 0-067 


0-441 
2-645 
1-854 
1-178 
2-186 
1-884 
0-353 
1-487 
1-278 
0-392 


“944 
-411 
"517 
-182 
-334 
1-409 
0-978 
0-876 


8| —0-164 


— 0-206 


— 0-784 
— 1-908 
— 1-189 
— 0-502 
— 0-066 
— 0-840 


| — 1-547 
: — 1-264 
9, —2-379 


—1-177 









































M. G. KENDALL 287 
Table 18. Series 16 
100+ | 150+ | 200+ | 250+ | 300+ | 350+ | 400+ 450+ 
=e -. 

gore ~0-950| —0-510| 0-782] 1-212] 2-643] —4-404| —1-514 
— 1-840] —0-842| —0-417| —0-229| 1-060] 0-389] —4-962| —0-623 
—2-607| —0-439| 0-382] 1-479] 1-022] —3-470] —2-520| —0-478 
—3-376| —0-353| 1-305] 2-379] —0-085| —5-591] 0-545| 0-548 
—3-035| 0-864| 0-654! 1-475] 0-732] —4-392] 2-663! -0-112 
~3-348| 2-270} 0-538; 0-155] 1-083] —2-209] 3-704] 1-287 
~0-134| 3-301] 0-933] —2-173| 2-033] —0-352] 3-730] 3-293 
0-656| 2-950| 0-807) —3-615| 1-701] —0-099| 1-325] 2-896 
0-885] 1-698] 1-007) —2-885| —0-361| 1-063] —0-677| 0-508 
—0-558| 0-345} 0-869) —1-085| —2-497| 1-611] —0-981 | — 1-852 
—0-147| —1-101| 1-569] 2-399] —4-262| 0-638] —0-995 | —0-276 
0-293 | —2:360| 1-479| 2-024] —2-711] —0-333) —1-763 | —0-283 
1-829| —1-439, 0-424! 0-329] —2-430| —1-897| 0-662] — 1-242 
0-592) L311) 0-521| —1-161| —0-075| —1-415| 1-912] —1-111 
—2-140| 1-653) 0-935| —1-954] 2-836] —1-942|] 2-794] —1-123 
—2-491| 2-216) 1-745) —1-682] 3-317| —1-781| 2-191 | —0-742 
~1-626|) 2-804) 2-092) —1-674] 1-154] —0-137| 1-028| —1-375 
—0-304| 4-298] 1-670) —1-200| —1-548| 1-512] 0-237] 0-075 
1-202| 3-422| 0-189) —1-420] —3-281| 1-651] 0-149] 2-419 
1:317| 2-071 | —0-102| —2-066] —2-276| 1-924] 0-228| 2-651 
0-576 | —1-412| 1-012] —0-536] —0-709} 1-975] 0-780| 0-746 
—0-150| —1-968| 1-224] 0-157] 0-471] 0-848] —0-034| —0-740 
1-152| —0-102| 0-828! —1-633| 2-009] 1-009] 0-540] — 1-622 
1-224] 1-268! —0-221| —1-693|. 2-228] —0-410| 1-793] ~ 1-534 
0-286| 1-078] 0-727] —0-586| 1-674] —1-272| 1-069] 0-325 
— 0-847 | — 0-758 2:894| 0-573| 1-440) —2-343| —1-307| 1-269 
—1-311| —1-662.| 1-397] 0-524] 0-066] —1-628| —3-096| 0-985 
— 1-573 | — 1-552 ~1-280| 0-685| —0-813] —1-354| —2-303] 1-057 
—0-506 | —0-472| —2-184| —0-447| —1-714| 0-739] —0-421 | —0-121 
1-700 | 2-173 | —0-730| —0-708| —1-864| 0-406] 1-364] 1-489 

| 
| 3:175| 1-939] 1-304) 0-181 | — 0-467] —1-072| 1-436] 1-055 
| 2770 | —0-975 3-031| 0-309! 1-474) —2-312| 2-206] 1-470 
| 2-848! —2-957| 1-874] —0-529| 1-485] —1-450| 1-130] 2-937 
1-846 | —3-524| 1-190) —1-019| 2-173] —0-260| —0-794| 3-681 
0-963 | —2-963| 2-232) —0-729| 1-838] 1-457) —2-513| 0-759 
— 1-119 | — 2-875 | 3-858 | —0-223| 0-463| 0-872| —3-121| — 1-468 
—1-409| —1-629| 2-959| —2-154! 0-051} — 1-933] — 1-857 | — 2-042 
0-453) 0-828) 2-638| —2-989] 0-607) —1-806, —0-964| — 1-349 
3-249} 3-574| 2-078] —4-117| —0-063| —1-667| —1-240| —0-791 
4-693 | 3-877| 2-361] —3-747| 0-854] —0-199| — 2-515 | —0-092 
ane! 2-884) 1-805] —2-668| 2-892! 2-062) —2-841| —0-795 
1-653 | —0-167| 1-509| —0-920| 2-966; 3-637| —1-908| 0-040 
| —0-513| — 2-624) 0-535) 0-748| 2-061) 2-915 —0-707| 0-960 
| ~0-862| —3-502! 1-200] —1-446| —0-720' 2-352] —0-444| 0-122 
0-439 | — 3-353, 0-127| —3-917| — 1-453) 0-757 | — 0-304 | —0-292 
1-859 | ~ 2.293 | —3-203 | — 3-804 | — 2-092 —1-551| — 0-219 | — 1-741 
2-675 | —0-853 | —3-135| —1-377| —1-266| —1-182| 0-477] — 1-774 
2:299| 0-838| —2173| 0-555) 1-012) —0-966| 1-167) —2-087 
0-647} 0-863| —0-912| 2-062] 1-347| —1-504| 1-277 | —0-746 
“He ose 0-069| 2-761| 1-904) —1-732| —1-009| 0-675 

| 
i 










Table 19. Values of $c_k$ and $r_k$ for series 1

k | c_k | r_k | k | c_k | r_k | k | c_k | r_k
0 | 1,217,176 | 1.000 | 17 | 76,902 | 0.066 | 34 | 54,121 | 0.048
1 | 927,053 | 0.763 | 18 | 37,770 | 0.032 | 35 | 9,302 | 0.008
2 | 458,107 | 0.378 | 19 | -8,169 | -0.007 | 36 | -41,850 | -0.037
3 | 96,479 | 0.080 | 20 | -29,750 | -0.026 | 37 | -52,745 | -0.047
4 | -80,933 | -0.067 | 21 | -6,035 | -0.005 | 38 | -26,534 | -0.024
5 | -95,200 | -0.079 | 22 | 22,188 | 0.019 | 39 | -19,038 | -0.017
6 | -47,312 | -0.039 | 23 | 12,361 | 0.011 | 40 | -33,765 | -0.030
7 | -8,895 | -0.007 | 24 | -12,771 | -0.011 | 41 | -43,076 | -0.039
8 | 26,685 | 0.022 | 25 | -47,794 | -0.041 | 42 | -13,075 | -0.012
9 | 21,158 | 0.018 | 26 | -91,940 | -0.080 | 43 | 37,682 | 0.034
10 | -42,728 | -0.036 | 27 | -125,573 | -0.109 | 44 | 61,317 | 0.055
11 | -123,352 | -0.104 | 28 | -146,002 | -0.127 | 45 | 53,467 | 0.048
12 | -173,350 | -0.146 | 29 | -134,177 | -0.117 | 46 | 44,736 | 0.041
13 | -153,577 | -0.130 | 30 | -86,031 | -0.075 | 47 | 45,566 | 0.041
14 | -62,428 | -0.053 | 31 | -20,509 | -0.018 | 48 | 38,801 | 0.035
15 | 34,670 | 0.029 | 32 | 44,838 | 0.039 | 49 | 24,435 | 0.022
16 | 82,284 | 0.070 | 33 | 77,980 | 0.069 | 50 | 7,831 | 0.007

Table 20. Values of $c_k$ and $r_k$ for series 2, 3 and 4

k | Series 2 c_k | Series 2 r_k | Series 3 c_k | Series 3 r_k | Series 4 c_k | Series 4 r_k
0 | 835,517 | 1.000 | 939,563 | 1.000 | 480,312 | 1.000
1 | 709,079 | 0.852 | 580,821 | 0.621 | -327,751 | -0.685
2 | 505,325 | 0.610 | -97,646 | -0.105 | 83,299 | 0.175
3 | 291,381 | 0.353 | -563,764 | -0.608 | 124,239 | 0.262
4 | 104,876 | 0.128 | -549,785 | -0.595 | -195,590 | -0.414
5 | -36,275 | -0.044 | -164,843 | -0.179 | 142,973 | 0.304
6 | -102,830 | -0.126 | 235,473 | 0.257 | -59,201 | -0.126
7 | -121,945 | -0.150 | 362,975 | 0.398 | -3,399 | -0.007
8 | -114,325 | -0.142 | 221,837 | 0.244 | 7,358 | 0.016
9 | -76,997 | -0.096 | -10,213 | -0.011 | 29,175 | 0.063
10 | -46,795 | -0.058 | -178,603 | -0.198 | -54,608 | -0.119
11 | -26,232 | -0.033 | -200,113 | -0.223 | 48,217 | 0.105
12 | -11,782 | -0.015 | -119,942 | -0.134 | -16,371 | -0.036
13 | -7,589 | -0.010 | -8,147 | -0.009 | -8,168 | -0.018
14 | -32,356 | -0.041 | 111,156 | 0.126 | -2,633 | -0.006
15 | -61,164 | -0.078 | 167,156 | 0.190 | 19,762 | 0.044
16 | -102,551 | -0.132 | 140,275 | 0.160 | -21,654 | -0.048
17 | -131,550 | -0.169 | 48,177 | 0.055 | -2,216 | -0.005
18 | -136,460 | -0.177 | -59,858 | -0.069 | 30,250 | 0.068
19 | -120,194 | -0.156 | -125,095 | -0.145 | -41,995 | -0.095
20 | -78,730 | -0.103 | -88,301 | -0.103 | 26,538 | 0.060
21 | -23,001 | -0.030 | 21,297 | 0.025 | 6,283 | 0.014
22 | 3[?],088 | 0.050 | 122,829 | 0.144 | -28,023 | -0.064
23 | 744,008 | 0.099 | 157,567 | 0.185 | 46,618 | 0.107
24 | 108,169 | 0.144 | 81,340 | 0.096 | -51,382 | -0.119
25 | 141,980 | 0.190 | -63,776 | -0.076 | 22,938 | 0.053
26 | 137,726 | 0.185 | -171,262 | -0.204 | 42,239 | 0.099
27 | 99,392 | 0.134 | -168,798 | -0.202 | -90,303 | -0.212
28 | 72,398 | 0.098 | -66,814 | -0.081 | 90,937 | 0.214
29 | 71,157 | 0.097 | 63,524 | 0.077 | -33,782 | -0.080
30 | 68,511 | 0.094 | 147,371 | 0.179 | -25,722 | -0.061



































































Table 21. Values of $c_k$ and $r_k$ for series 5 and 6

Series | 5a | 5b | 5c | 5d | 5 (in toto)
c_0 | 977,837 | 958,436 | 879,736 | 872,844 | 3,688,853
c_1 | 739,993 | 702,480 | 607,461 | 619,552 | 2,672,905
r_1 | 0.7568 | 0.7329 | 0.6905 | 0.7098 | 0.7246
c_2 | 364,400 | 282,372 | 168,243 | 223,038 | 1,043,493
r_2 | 0.3727 | 0.2946 | 0.1912 | 0.2555 | 0.282
c_3 | 58,786 | -60,840 | -128,204 | -98,570 | -221,255
r_3 | 0.0601 | -0.0635 | -0.1457 | -0.1129 | -0.0600
c_4 | -90,575 | -243,083 | -198,042 | -276,361 | -804,161
r_4 | -0.0926 | -0.2536 | -0.2251 | -0.3166 | -0.2180

Series | 6a | 6b | 6c | 6d | 6 (in toto)
c_0 | 1,551,606 | 1,505,293 | 1,548,545 | 1,186,705 | 5,792,149
c_1 | 924,569 | 903,559 | 951,497 | 750,232 | 3,531,836
r_1 | 0.5959 | 0.6003 | 0.6144 | 0.6322 | 0.6098
c_2 | -270,379 | -220,471 | -194,686 | -48,120 | -720,610
r_2 | -0.1743 | -0.1465 | -0.1257 | -0.0405 | -0.1244
c_3 | -1,021,112 | -927,370 | -976,079 | -604,656 | -3,509,827
r_3 | -0.6581 | -0.6161 | -0.6303 | -0.5095 | -0.6060
c_4 | -872,747 | -793,867 | -901,618 | -645,513 | -3,206,192
r_4 | -0.5625 | -0.5274 | -0.5822 | -0.5440 | -0.5535








Table 22. Values of $c_k$ and $r_k$ for series 8, 10, 12, 14 and 16

k | Series 8 c_k | r_k | Series 10 c_k | r_k | Series 12 c_k | r_k | Series 14 c_k | r_k | Series 16 c_k | r_k
0 | 12,406.47 | 1.000 | 6414.470 | 1.000 | 3346.111 | 1.000 | [?] | [?] | 1672.212 | 1.000
1 | 11,751.63 | 0.949 | 5707.452 | 0.892 | 2815.257 | 0.843 | 1707.938 | 0.809 | 1253.272 | 0.751
2 | 10,182.50 | 0.824 | 4131.158 | 0.647 | 1714.249 | 0.514 | 973.957 | 0.462 | 489.683 | 0.294
3 | 8,303.08 | 0.673 | 2407.876 | 0.378 | 650.804 | 0.196 | 358.001 | 0.170 | -133.027 | -0.080
4 | 6,590.10 | 0.535 | 1064.645 | 0.167 | -49.417 | -0.015 | 2.327 | 0.001 | -387.373 | -0.234
5 | 5,299.99 | 0.432 | 302.223 | 0.048 | -408.133 | -0.123 | -132.306 | -0.063 | -311.683 | -0.188
6 | 4,410.17 | 0.360 | 39.600 | 0.006 | -548.680 | -0.166 | -143.107 | -0.068 | -85.278 | -0.052
7 | 3,865.31 | 0.316 | 90.769 | 0.014 | -532.595 | -0.161 | -143.734 | -0.069 | 125.455 | 0.076
8 | 3,502.73 | 0.287 | 274.386 | 0.043 | -395.760 | -0.120 | -144.603 | -0.069 | 245.460 | 0.149
9 | 3,211.80 | 0.264 | 471.979 | 0.075 | -188.798 | -0.057 | -116.614 | -0.056 | 271.914 | 0.166
10 | 2,959.40 | 0.243 | 624.129 | 0.099 | 36.728 | 0.011 | -49.845 | -0.024 | 197.670 | 0.121
11 | 2,767.02 | 0.228 | 725.064 | 0.116 | 189.907 | 0.058 | 13.060 | 0.006 | 80.693 | 0.049
12 | 2,660.60 | 0.220 | 771.678 | 0.123 | 231.281 | 0.071 | 28.253 | 0.014 | -11.174 | -0.007
13 | 2,627.86 | 0.217 | 768.202 | 0.123 | 182.206 | 0.056 | -4.922 | -0.002 | -49.764 | -0.031
14 | 2,632.93 | 0.218 | 682.319 | 0.109 | 150.430 | 0.046 | -61.316 | -0.030 | -30.803 | -0.019
15 | 2,639.50 | 0.219 | 550.040 | 0.088 | 175.750 | 0.054 | -119.775 | -0.058 | -8.137 | -0.005
16 | 2,632.87 | 0.219 | 396.900 | 0.064 | 196.577 | 0.061 | -147.510 | -0.072 | -9.094 | -0.006
17 | 2,570.85 | 0.215 | 221.944 | 0.036 | 192.876 | 0.060 | -148.196 | -0.073 | -26.163 | -0.016
18 | 2,425.49 | 0.203 | 67.235 | 0.011 | 190.296 | 0.059 | -102.048 | -0.050 | -53.165 | -0.033
19 | 2,201.66 | 0.184 | -40.242 | -0.007 | 219.116 | 0.068 | -21.133 | -0.010 | -54.609 | -0.034
20 | 1,924.49 | 0.162 | -131.413 | -0.021 | 308.266 | 0.096 | 60.353 | 0.030 | -23.027 | -0.014
21 | 1,622.42 | 0.137 | -216.876 | -0.035 | 416.661 | 0.130 | 141.538 | 0.070 | 69.210 | 0.043
22 | 1,300.73 | 0.110 | -294.638 | -0.048 | 497.343 | 0.155 | 181.089 | 0.090 | 169.082 | 0.106
23 | 957.63 | 0.081 | -354.742 | -0.058 | 501.373 | 0.157 | 159.227 | 0.079 | 225.251 | 0.141
24 | 622.43 | 0.053 | -416.650 | -0.068 | 453.351 | 0.142 | 114.051 | 0.057 | 229.216 | 0.144
25 | 347.31 | 0.029 | -465.492 | -0.076 | 357.632 | 0.113 | 52.435 | 0.026 | 194.919 | 0.123
26 | 150.48 | 0.013 | -430.809 | -0.071 | 199.598 | 0.063 | -13.199 | -0.007 | 140.516 | 0.089
27 | 23.52 | 0.002 | -272.333 | -0.045 | -24.634 | -0.008 | -26.853 | -0.013 | 51.111 | 0.032
28 | -63.54 | -0.005 | -6.663 | -0.001 | -221.975 | -0.070 | 17.009 | 0.009 | -88.074 | -0.056
29 | -147.03 | -0.013 | 282.596 | 0.047 | -292.376 | -0.093 | 89.160 | 0.045 | -240.431 | -0.153
30 | -241.18 | -0.021 | 503.641 | 0.084 | -248.584 | -0.079 | 164.033 | 0.082 | -330.713 | -0.210

[?] marks an entry that is illegible in this copy.

























TABLES FOR USE IN COMPARISONS WHOSE ACCURACY 
INVOLVES TWO VARIANCES, SEPARATELY ESTIMATED 


By ALICE A. ASPIN 


INTRODUCTION 


The tables are designed for use when the precision of an estimate, $y$, of a population parameter, $\eta$, depends linearly on two population variances, $\sigma_1^2$ and $\sigma_2^2$, the sampling variance of $y$ being therefore of the form $(\lambda_1\sigma_1^2+\lambda_2\sigma_2^2)$, where $\lambda_1$ and $\lambda_2$ are known positive constants. If $s_1^2$ and $s_2^2$ are estimates of $\sigma_1^2$ and $\sigma_2^2$, based on $f_1$ and $f_2$ degrees of freedom, respectively, then the tables give, for two probability levels, critical values of the ratio

$$v = (y-\eta)/\sqrt{(\lambda_1 s_1^2+\lambda_2 s_2^2)}.$$

The tables are based on the assumptions of normal theory, the methods used in their calculation having been described by B. L. Welch (1947) and A. A. Aspin (1948). In particular, it is assumed that $s_1^2$ and $s_2^2$ are distributed independently of each other and of $y$.

Even under the restrictive assumptions of normal theory, it is not possible to table critical values of $v$ which are fixed independently of all the sample statistics. For a given probability level, $\epsilon$, these critical values depend on $f_1$ and $f_2$ and also on the ratio

$$c = \lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2),$$

i.e. on the relative magnitudes of the observed sample variances. The function to be tabled, therefore, involves four arguments and may be written

$$v\left\{f_1,\ f_2,\ \frac{\lambda_1 s_1^2}{\lambda_1 s_1^2+\lambda_2 s_2^2},\ \epsilon\right\}.$$

It has the property that the probability,

$$\Pr\left[\frac{(y-\eta)}{\sqrt{(\lambda_1 s_1^2+\lambda_2 s_2^2)}} > v\left\{f_1,\ f_2,\ \frac{\lambda_1 s_1^2}{\lambda_1 s_1^2+\lambda_2 s_2^2},\ \epsilon\right\}\right],$$

equals $\epsilon$, whatever may be the values of the unknown population variances, $\sigma_1^2$ and $\sigma_2^2$.

Two probability levels ($\epsilon = 0.05$ and $\epsilon = 0.01$) are tabled on separate pages, the object in each case being to cover, in a small compass, a wide range of the other arguments involved. Direct linear interpolation should be used everywhere except, for each $f$, in the panel which includes $f = \infty$. Here harmonic linear interpolation will be needed.

Some remarks on the accuracy of the tables and of certain approximations, which may be used for other probability levels, are made later in an Appendix by B. L. Welch (in particular see § 6).
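
The harmonic interpolation mentioned above can be sketched as follows, on the usual reading that it means interpolating linearly in $1/f$ rather than in $f$. The function name and the intermediate value $f_1 = 60$ are illustrative assumptions; the two end values are the Table 2 entries for $f_2 = 10$, $c = 0.5$ at $f_1 = 30$ and $f_1 = \infty$.

```python
def harmonic_interpolate(f, f_lo, v_lo, v_inf):
    """Interpolate linearly in 1/f between a finite f_lo and f = infinity.

    v_lo is the tabled value at f_lo and v_inf the tabled value at infinity.
    (Hypothetical helper, not part of the original tables.)
    """
    if f < f_lo:
        raise ValueError("f must lie between f_lo and infinity")
    # (1/f) / (1/f_lo): equals 1 at f = f_lo and tends to 0 as f grows.
    weight = f_lo / f
    return v_inf + weight * (v_lo - v_inf)


# Table 2, f2 = 10, c = 0.5: 2.46 at f1 = 30 and 2.44 at f1 = infinity.
v_60 = harmonic_interpolate(60, f_lo=30, v_lo=2.46, v_inf=2.44)
print(round(v_60, 3))
```

Since $1/60$ lies halfway between $1/30$ and $0$, the interpolated value falls midway between the two tabled entries.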


EXAMPLE 


If $(\alpha_1-\alpha_2)$ is the difference between the means of two normal populations whose standard deviations cannot reasonably be assumed equal; and if $(\bar{x}_1-\bar{x}_2)$ is the difference between the two means of samples of sizes $n_1$ and $n_2$ taken respectively from the populations; then we may put

$$v = \frac{(\bar{x}_1-\bar{x}_2)-(\alpha_1-\alpha_2)}{\sqrt{(s_1^2/n_1+s_2^2/n_2)}}, \qquad \frac{\lambda_1 s_1^2}{(\lambda_1 s_1^2+\lambda_2 s_2^2)} = \frac{s_1^2/n_1}{(s_1^2/n_1+s_2^2/n_2)},$$

$$f_1 = (n_1-1) \quad\text{and}\quad f_2 = (n_2-1),$$







Table 1. Value of $v = (y-\eta)/\sqrt{(\lambda_1 s_1^2+\lambda_2 s_2^2)}$ exceeded with probability $\epsilon = 0.05$*

[The body of Table 1 is not legible in this copy.]

* $y$ is normally distributed about $\eta$ with variance $(\lambda_1\sigma_1^2+\lambda_2\sigma_2^2)$, and $s_1^2$ and $s_2^2$ are independent estimates of $\sigma_1^2$ and $\sigma_2^2$, based on $f_1$ and $f_2$ degrees of freedom, respectively.


























Table 2. Value of $v = (y-\eta)/\sqrt{(\lambda_1 s_1^2+\lambda_2 s_2^2)}$ exceeded with probability $\epsilon = 0.01$*
[or of $|v|$ exceeded with probability $2\epsilon = 0.02$]

Column headings are values of $\lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2)$.

f2 | f1 | 0.0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0
10 | 10 | 2.76 | 2.70 | 2.63 | 2.56 | 2.51 | 2.50 | 2.51 | 2.56 | 2.63 | 2.70 | 2.76
10 | 12 | 2.76 | 2.70 | 2.63 | 2.56 | 2.51 | 2.49 | 2.49 | 2.52 | 2.57 | 2.62 | 2.68
10 | 15 | 2.76 | 2.70 | 2.63 | 2.56 | 2.51 | 2.48 | 2.47 | 2.48 | 2.52 | 2.56 | 2.60
10 | 20 | 2.76 | 2.70 | 2.63 | 2.56 | 2.51 | 2.47 | 2.45 | 2.45 | 2.47 | 2.49 | 2.53
10 | 30 | 2.76 | 2.70 | 2.63 | 2.56 | 2.50 | 2.46 | 2.43 | 2.42 | 2.42 | 2.44 | 2.46
10 | ∞ | 2.76 | 2.70 | 2.63 | 2.56 | 2.50 | 2.44 | 2.40 | 2.36 | 2.34 | 2.33 | 2.33
12 | 10 | 2.68 | 2.62 | 2.57 | 2.52 | 2.49 | 2.49 | 2.51 | 2.56 | 2.63 | 2.70 | 2.76
12 | 12 | 2.68 | 2.62 | 2.57 | 2.52 | 2.48 | 2.47 | 2.48 | 2.52 | 2.57 | 2.62 | 2.68
12 | 15 | 2.68 | 2.62 | 2.57 | 2.52 | 2.48 | 2.46 | 2.46 | 2.48 | 2.52 | 2.56 | 2.60
12 | 20 | 2.68 | 2.62 | 2.57 | 2.52 | 2.48 | 2.45 | 2.44 | 2.45 | 2.47 | 2.49 | 2.53
12 | 30 | 2.68 | 2.62 | 2.57 | 2.52 | 2.47 | 2.44 | 2.42 | 2.41 | 2.42 | 2.44 | 2.46
12 | ∞ | 2.68 | 2.62 | 2.57 | 2.51 | 2.46 | 2.42 | 2.38 | 2.36 | 2.34 | 2.33 | 2.33
15 | 10 | 2.60 | 2.56 | 2.52 | 2.48 | 2.47 | 2.48 | 2.51 | 2.56 | 2.63 | 2.70 | 2.76
15 | 12 | 2.60 | 2.56 | 2.52 | 2.48 | 2.46 | 2.46 | 2.48 | 2.52 | 2.57 | 2.62 | 2.68
15 | 15 | 2.60 | 2.56 | 2.51 | 2.48 | 2.45 | 2.45 | 2.45 | 2.48 | 2.51 | 2.56 | 2.60
15 | 20 | 2.60 | 2.56 | 2.51 | 2.48 | 2.45 | 2.43 | 2.43 | 2.44 | 2.46 | 2.49 | 2.53
15 | 30 | 2.60 | 2.56 | 2.51 | 2.47 | 2.44 | 2.42 | 2.41 | 2.41 | 2.42 | 2.44 | 2.46
15 | ∞ | 2.60 | 2.56 | 2.51 | 2.47 | 2.43 | 2.40 | 2.37 | 2.35 | 2.34 | 2.33 | 2.33
20 | 10 | 2.53 | 2.49 | 2.47 | 2.45 | 2.45 | 2.47 | 2.51 | 2.56 | 2.63 | 2.70 | 2.76
20 | 12 | 2.53 | 2.49 | 2.47 | 2.45 | 2.44 | 2.45 | 2.48 | 2.52 | 2.57 | 2.62 | 2.68
20 | 15 | 2.53 | 2.49 | 2.46 | 2.44 | 2.43 | 2.43 | 2.45 | 2.48 | 2.51 | 2.56 | 2.60
20 | 20 | 2.53 | 2.49 | 2.46 | 2.44 | 2.42 | 2.42 | 2.42 | 2.44 | 2.46 | 2.49 | 2.53
20 | 30 | 2.53 | 2.49 | 2.46 | 2.44 | 2.42 | 2.40 | 2.40 | 2.40 | 2.42 | 2.43 | 2.46
20 | ∞ | 2.53 | 2.49 | 2.46 | 2.43 | 2.40 | 2.38 | 2.36 | 2.34 | 2.33 | 2.33 | 2.33
30 | 10 | 2.46 | 2.44 | 2.42 | 2.42 | 2.43 | 2.46 | 2.50 | 2.56 | 2.63 | 2.70 | 2.76
30 | 12 | 2.46 | 2.44 | 2.42 | 2.41 | 2.42 | 2.44 | 2.47 | 2.52 | 2.57 | 2.62 | 2.68
30 | 15 | 2.46 | 2.44 | 2.42 | 2.41 | 2.41 | 2.42 | 2.44 | 2.47 | 2.51 | 2.56 | 2.60
30 | 20 | 2.46 | 2.43 | 2.42 | 2.40 | 2.40 | 2.40 | 2.42 | 2.44 | 2.46 | 2.49 | 2.53
30 | 30 | 2.46 | 2.43 | 2.42 | 2.40 | 2.39 | 2.39 | 2.39 | 2.40 | 2.42 | 2.43 | 2.46
30 | ∞ | 2.46 | 2.43 | 2.41 | 2.39 | 2.37 | 2.36 | 2.35 | 2.34 | 2.33 | 2.33 | 2.33
∞ | 10 | 2.33 | 2.33 | 2.34 | 2.36 | 2.40 | 2.44 | 2.50 | 2.56 | 2.63 | 2.70 | 2.76
∞ | 12 | 2.33 | 2.33 | 2.34 | 2.36 | 2.38 | 2.42 | 2.46 | 2.51 | 2.57 | 2.62 | 2.68
∞ | 15 | 2.33 | 2.33 | 2.34 | 2.35 | 2.37 | 2.40 | 2.43 | 2.47 | 2.51 | 2.56 | 2.60
∞ | 20 | 2.33 | 2.33 | 2.33 | 2.34 | 2.36 | 2.38 | 2.40 | 2.43 | 2.46 | 2.49 | 2.53
∞ | 30 | 2.33 | 2.33 | 2.33 | 2.34 | 2.35 | 2.36 | 2.37 | 2.39 | 2.41 | 2.43 | 2.46
∞ | ∞ | 2.33 | 2.33 | 2.33 | 2.33 | 2.33 | 2.33 | 2.33 | 2.33 | 2.33 | 2.33 | 2.33






































* $y$ is normally distributed about $\eta$ with variance $(\lambda_1\sigma_1^2+\lambda_2\sigma_2^2)$, and $s_1^2$ and $s_2^2$ are independent estimates of $\sigma_1^2$ and $\sigma_2^2$, based on $f_1$ and $f_2$ degrees of freedom, respectively. $\lambda_1$ and $\lambda_2$ are known constants. In the problem of comparing the means of samples taken from two normal populations, put $y = (\bar{x}_1-\bar{x}_2)$, $f_1 = (n_1-1)$, $f_2 = (n_2-1)$, $\lambda_1 = 1/n_1$ and $\lambda_2 = 1/n_2$, where $n_1$ and $n_2$ are the sample sizes.



















where $s_1^2$ and $s_2^2$ are the sample estimates of variance. The tables may then be used to make inferences about $(\alpha_1-\alpha_2)$.

Thus if $n_1 = 10$, $n_2 = 15$, $\bar{x}_1 = 73.4$, $\bar{x}_2 = 47.1$, $s_1^2 = 51$, and $s_2^2 = 141$, we shall have $(\bar{x}_1-\bar{x}_2) = 26.3$, $f_1 = 9$, $f_2 = 14$ and

$$v = \frac{26.3-(\alpha_1-\alpha_2)}{3.81}, \qquad \frac{\lambda_1 s_1^2}{\lambda_1 s_1^2+\lambda_2 s_2^2} = 0.352.$$

From the tables

$$v\left\{f_1,\ f_2,\ \frac{\lambda_1 s_1^2}{\lambda_1 s_1^2+\lambda_2 s_2^2},\ 0.05\right\} = v\{9,\ 14,\ 0.352,\ 0.05\} = 1.71.$$

If it were a matter of testing the consistency of the data with the hypothesis that $\alpha_1 = \alpha_2$ we should have $v = 26.3/3.81 = 6.90$, which is clearly far beyond the tabled 5% point.

In obtaining an interval estimate for $(\alpha_1-\alpha_2)$ we should note that in the above description of the tables we have been dealing explicitly with one-sided probabilities. The chance that $v$ exceeds the tabled value numerically, either in the positive or negative direction, is $2\epsilon$. Thus

$$\Pr\left[-1.71 < \frac{26.3-(\alpha_1-\alpha_2)}{3.81} < 1.71\right] = 0.90,$$

i.e. $\Pr[19.8 < (\alpha_1-\alpha_2) < 32.8] = 0.90$,

i.e. the 90% confidence limits for $(\alpha_1-\alpha_2)$ are 19.8 and 32.8.
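
The arithmetic of this example can be retraced with a short sketch. The function and variable names are illustrative assumptions; the critical value 1.71 is the figure quoted from the tables in the text and is supplied by hand, not computed.

```python
import math


def two_variance_summary(mean1, mean2, var1, var2, n1, n2):
    """Quantities needed to enter the tables for a difference of two means.

    Uses lambda_i = 1/n_i and f_i = n_i - 1, as in the example.
    (Hypothetical helper written for illustration.)
    """
    lam1, lam2 = 1.0 / n1, 1.0 / n2
    est_var = lam1 * var1 + lam2 * var2       # lambda1*s1^2 + lambda2*s2^2
    return {
        "diff": mean1 - mean2,
        "se": math.sqrt(est_var),             # denominator of v
        "c": lam1 * var1 / est_var,           # ratio used to enter the tables
        "f1": n1 - 1,
        "f2": n2 - 1,
    }


s = two_variance_summary(73.4, 47.1, 51, 141, 10, 15)
v_null = s["diff"] / s["se"]                  # v under alpha1 = alpha2

v_crit = 1.71                                 # tabled value quoted in the text
lower = s["diff"] - v_crit * s["se"]
upper = s["diff"] + v_crit * s["se"]
print(s, v_null, lower, upper)
```

The denominator comes out close to 3.81 and the ratio close to 0.352, and the interval rounds to the limits 19.8 and 32.8 given above.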


APPENDIX 


FURTHER NOTE ON Mrs ASPIN’S TABLES AND ON CERTAIN 
APPROXIMATIONS TO THE TABLED FUNCTION 


By B. L. WELCH 


1. More than one solution has been proposed for the problem of dealing with comparisons whose estimated precision involves two separate estimates of variance. The present tables are based on a solution which relies only on probability calculations of the so-called ‘direct’ variety. I have given elsewhere a method of developing this solution in a power series in $1/f_i$ (Welch, 1947), and further terms in the series have been given by A. A. Aspin (1948). In calculating the tables, Mrs Aspin has in general utilized the series as far as terms of order $1/f_i^3$, and, in certain doubtful cases, has also evaluated terms of order $1/f_i^4$. Taking the series thus far, it is possible to give $v$ to two decimal places down to the lowest values of $f_1$ and $f_2$ shown in the tables, but, for smaller $f_1$ and $f_2$, two-decimal accuracy could not, in general, have been guaranteed. For the larger values of $f_1$ and $f_2$ more figures could, of course, have been given, but there are advantages, in a table of this character, in keeping to the same number of figures throughout.

It is natural to ask what degree of accuracy in the probability is implied by the use of values of $v$ rounded off to two decimal places. For $f_1$ and $f_2$ large, this question can easily be answered. For then $v$ tends to be normally distributed, and, on the normal curve, two-decimal accuracy in the deviate, in the vicinity of $\epsilon = 0.05$, implies three-decimal accuracy in the probability of the deviate being exceeded. In the vicinity of $\epsilon = 0.01$, two-decimal accuracy in the deviate implies errors in the probability of not more than 2 units in the fourth decimal place.
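
The large-sample statement can be checked numerically. The sketch below assumes that "two-decimal accuracy" means a rounding error of at most 0.005 in the deviate, and uses the unit normal distribution from the Python standard library; the function name is illustrative.

```python
from statistics import NormalDist


def max_tail_error(eps, rounding=0.005):
    """Largest change in the upper-tail probability when the normal
    deviate for level eps is moved by at most `rounding` either way."""
    nd = NormalDist()                  # unit normal
    z = nd.inv_cdf(1.0 - eps)          # exact deviate exceeded with probability eps

    def upper_tail(x):
        return 1.0 - nd.cdf(x)

    return max(abs(upper_tail(z - rounding) - eps),
               abs(upper_tail(z + rounding) - eps))


err_05 = max_tail_error(0.05)
err_01 = max_tail_error(0.01)
print(err_05, err_01)
```

On these assumptions the error at $\epsilon = 0.05$ stays below one unit in the third decimal place, and at $\epsilon = 0.01$ it stays below two units in the fourth.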










For lower values of $f_1$ and $f_2$ it is not possible to be dogmatic, although it is very often the case, when we are using series of the present type, that the greater the struggle needed to obtain extra decimal places in the deviate, the less will be the effect upon the probability itself of any rounding-off error in the deviate. Although I believe this to be true in the present instance it has seemed to me, nevertheless, advisable to make some direct calculations, using quadratures, to test out the question thoroughly. Accordingly for the lowest values $f_1 = f_2 = 6$ given in the table for $\epsilon = 0.05$, I have made some calculations of this nature and will present the results below. It will be convenient at the same time, for the same particular case, to give the results of similar calculations which throw light on the accuracy of certain simple approximations to the tabled function.


2. If we are confining ourselves to a given pair of values, $f_1$ and $f_2$, and to a given probability level, $\epsilon$, the function $v\{f_1, f_2, \lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2), \epsilon\}$ need, momentarily at least, be regarded only as a function of the one argument $c = \lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2)$ and may conveniently be denoted simply by $v(c)$. Since it is not the function $v(c)$ itself that we are concerned with, but the tabular values obtained after rounding off $v(c)$ to two decimal places, it will be convenient to have a further symbol $v_T(c)$, say, to denote these rounded-off values. In the first line of Table A we reproduce from Mrs Aspin’s Table 1, the values $v_T(c)$ for the case $f_1 = f_2 = 6$, $\epsilon = 0.05$. Let us now consider the probability that $v$ exceeds $v_T(c)$.


Table A. Critical values of $v$ ($f_1 = f_2 = 6$; $\epsilon = 0.05$)

c = | 0.0 or 1.0 | 0.1 or 0.9 | 0.2 or 0.8 | 0.3 or 0.7 | 0.4 or 0.6 | 0.5
Tabled values $v_T(c)$ | 1.94 | 1.90 | 1.85 | 1.80 | 1.76 | 1.74
Approximation $v_S(c)$ | 1.94 | 1.88 | 1.84 | 1.81 | 1.79 | 1.78
Approximation $v_S'(c)$ | 1.94 | 1.87 | 1.82 | 1.79 | 1.77 | 1.76





























If Pr. $\{v > v_T(c)\}$ depends on $\sigma_1^2$ and $\sigma_2^2$ at all, it must depend on them through the ratio $\gamma = \lambda_1\sigma_1^2/(\lambda_1\sigma_1^2+\lambda_2\sigma_2^2)$. If, then, we take different values of $\gamma$, and calculate for each one, directly by quadrature, the value Pr. $\{v > v_T(c)\}$, we shall have solved our problem. This calculation by quadrature is heavy, although it is lightened considerably by the use of some rather neglected tables of the probability integral of ‘Student’s’ distribution (Gosset, 1925). It is not proposed to enter into the details of the calculations, but it should be mentioned that a rather broad network was used in the quadrature and that the last decimal place given below cannot, on this account, be fully guaranteed.

The following results were obtained. For $\gamma$ = 0.1, 0.2, 0.3, 0.4 and 0.5 the respective values for Pr. $\{v > v_T(c)\}$ were 0.0501, 0.0500, 0.0500, 0.0498 and 0.0498. It is clear, therefore, that to three decimals we certainly have Pr. $\{v > v_T(c)\} = 0.050$, whatever $\gamma$, and indeed that the error can, at most, be only a unit or two in the fourth decimal place. It seems, therefore, safe to assume that throughout Table 1 the rounding off to two decimal places in the deviate leaves us easily with three decimal places in the probability, 0.050, that the deviate is exceeded. Similarly in Table 2 we can almost everywhere expect that the probability that $v$ exceeds the tabled values will be 0.0100 to four decimal places.




3. The provision of tables of $v$ covering a large number of probability levels would clearly be a major task, and possibly one which would not justify the labour involved. It is therefore of some importance to consider the order of accuracy attainable with any approximation which only utilizes existing tables. A simple approximation of this character is to take for the critical $v$ the value given in a ‘Student’ $t$ table corresponding to degrees of freedom $F$, where

$$1/F = \{c^2/f_1+(1-c)^2/f_2\}.$$

Consideration of the series development of the ‘Student’ deviate shows that this procedure is legitimate if terms of order $1/f_i^2$ can be neglected.
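
As an illustration of this approximation, the degrees of freedom $F$ can be computed directly from $c$, $f_1$ and $f_2$. The function name below is an assumption made for the sketch; the critical value itself would still be read from a ‘Student’ $t$ table, interpolating where $F$ is not an integer, and that step is not attempted here.

```python
def effective_df(c, f1, f2):
    """Degrees of freedom F defined by 1/F = c^2/f1 + (1 - c)^2/f2."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("c must lie between 0 and 1")
    return 1.0 / (c ** 2 / f1 + (1.0 - c) ** 2 / f2)


# Case considered in the Appendix: f1 = f2 = 6.
F_half = effective_df(0.5, 6, 6)        # equal weights on the two variances
F_zero = effective_df(0.0, 6, 6)        # only the second variance matters

# Arguments of the worked example in the main paper: f1 = 9, f2 = 14, c = 0.352.
F_example = effective_df(0.352, 9, 14)
print(F_half, F_zero, F_example)
```

At $c = 0$ or $c = 1$ the formula reduces to the degrees of freedom of the single variance involved, and at $c = 0.5$ with $f_1 = f_2$ it gives twice that number.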

Numerically, in the particular case $f_1 = f_2 = 6$ with $\epsilon = 0.05$, this approximation is equivalent to taking for $v(c)$ the values $v_S(c)$ shown in the second line of Table A. By direct evaluation, using quadratures, it is possible now to obtain the probability that $v > v_S(c)$. Results of this calculation, for different $\gamma$, are shown in the second line of Table B. In this case an error of one or two units in the third decimal place of the probability is apparently possible with this approximation.

Table B

$\gamma$ = | 0.0 or 1.0 | 0.1 or 0.9 | 0.2 or 0.8 | 0.3 or 0.7 | 0.4 or 0.6 | 0.5
Pr. $\{v > v_T(c)\}$ | 0.050 | 0.050 | 0.050 | 0.050 | 0.050 | 0.050
Pr. $\{v > v_S(c)\}$ | 0.050 | 0.051 | 0.050 | 0.049 | 0.048 | 0.048
Pr. $\{v > v_S'(c)\}$ | 0.050 | 0.052 | 0.051 | 0.051 | 0.050 | 0.050

As $f_1$ and $f_2$ increase, the approximation using the ‘Student’ table improves, and for practical purposes this procedure is therefore extremely useful. The approximation can also be used in conjunction with the above-mentioned tables of the probability integral of the ‘Student’ distribution (Gosset, 1925) to give actual probabilities, if it is felt that reference to a few percentage points is not sufficient.


4. To order $1/f_i$, the above-described procedure will not be altered if we enter the ‘Student’ $t$ table with some other number of degrees of freedom, $F'$ (say), which differs from $F$ only to order $1/f_i^2$. A possible modification (see Welch, 1947, p. 32, equation (29)) is to take $F'$ given by

$$1/(F'+2) = \{c^2/(f_1+2)+(1-c)^2/(f_2+2)\}.$$

A form like this is suggested by the behaviour of the successive groups of terms in the true series solution. An expansion of this series in terms of $1/(f_i+2)$, $1/(f_i+2)(f_i+4)$, etc., instead of powers of $1/f_i$ appears to introduce some simplification into the algebra. Numerical calculations, however, show this gain to be largely illusory (Aspin, 1948), and the corresponding ‘Student’ approximation, using $F'$, not to have the advantages expected.
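
A small sketch, with an assumed function name, shows how $F'$ follows from the relation above and how it compares with $F$ in the case $f_1 = f_2 = 6$ treated in this Appendix.

```python
def modified_df(c, f1, f2):
    """Degrees of freedom F' defined by
    1/(F' + 2) = c^2/(f1 + 2) + (1 - c)^2/(f2 + 2)."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("c must lie between 0 and 1")
    return 1.0 / (c ** 2 / (f1 + 2) + (1.0 - c) ** 2 / (f2 + 2)) - 2.0


def effective_df(c, f1, f2):
    """Unmodified F, with 1/F = c^2/f1 + (1 - c)^2/f2, for comparison."""
    return 1.0 / (c ** 2 / f1 + (1.0 - c) ** 2 / f2)


for c in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(c, round(effective_df(c, 6, 6), 2), round(modified_df(c, 6, 6), 2))
```

The two definitions agree at $c = 0$ and $c = 1$; elsewhere $F'$ is somewhat larger than $F$, so the ‘Student’ value entered with $F'$ is somewhat smaller.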

In the third line of Table A are shown the values $v_S'(c)$ which result when $F'$ is used, and the third line in Table B shows values of Pr. $\{v > v_S'(c)\}$ calculated by quadrature. In the neighbourhood of $\gamma = \tfrac{1}{2}$, $F'$ has some advantage, but, taking the overall picture, the accuracy given is of the same order as that given by using the $F$ of the previous section. Since $F$ is, perhaps, rather simpler to use, we have therefore no very good reason to introduce a modification of the present nature.













5. Approximations, utilizing tables of the ‘Student’ distribution and agreeing with the true solution to terms of order $1/f_i^2$ or higher, can be given. The labour involved in using them, however, begins to approach that involved in working out the true series solution term by term, and there seems, therefore, little point in describing them.

6. Summary. On the basis of the above calculations for the case $f_1 = f_2 = 6$ and the known behaviour of the distribution of $v$ when $f_1$ and $f_2$ are large, it seems safe to say that, although the values of $v$ given in Table 1 are rounded off to two decimal places, the probability of the values thus tabled being exceeded will always equal 0.050 to three decimal places. Usually, indeed, the accuracy of the probability will be even better than this.

Similar considerations for Table 2 suggest that the probability of the given values of $v$ being exceeded will almost everywhere equal 0.0100 to four decimal places.

The above calculations also throw some light on an approximate method which consists in taking for the required $v$, the values in a ‘Student’ $t$ table corresponding to $F$ degrees of freedom, where

$$1/F = \{c^2/f_1+(1-c)^2/f_2\} \quad\text{and}\quad c = \lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2).$$

If this approximation be used for values of the arguments not covered by Mrs Aspin’s tables, or in cases where actual probabilities are required rather than references to fixed probability levels, it usually gives good results.


REFERENCES 


ASPIN, A. A. (1948). An examination and further development of a formula occurring in the problem of comparing two mean values. Biometrika, 35, 88-96.

GOSSET, W. S. (1925). New tables for testing the significance of observations. Metron, 5, 105. (Reprinted in ‘Student’s’ Collected Papers, Cambridge University Press, 1942.)

WELCH, B. L. (1947). The generalization of ‘Student’s’ problem when several different population variances are involved. Biometrika, 34, 28-35.







BIVARIATE DISTRIBUTIONS BASED ON SIMPLE 
TRANSLATION SYSTEMS 


By N. L. JOHNSON 


1. INTRODUCTION 


In a previous paper (Johnson, 1949) systems of frequency curves based on transformations
of the form

    $z = \gamma + \delta f\{(x-\xi)/\lambda\} = \gamma + \delta f(y)$,

where $z$ is a unit Normal variable, $f$ is a function, and $\gamma$, $\delta$, $\xi$ and $\lambda$ are parameters, have been
discussed. The special systems

    $S_L$, corresponding to $f(y) = \log y$ (‘log Normal’),
    $S_B$, corresponding to $f(y) = \log\{y/(1-y)\}$,
and $S_U$, corresponding to $f(y) = \log[y + \sqrt{(y^2+1)}]$,

were considered in some detail.

The present paper is concerned with certain bivariate distributions which can be constructed
on the basis of univariate distributions of systems $S_L$, $S_B$ or $S_U$, together with the
Normal distribution, denoted below as $S_N$.

If it be supposed that the unit Normal variables

    $z_1 = \gamma_1 + \delta_1 f_1(y_1), \quad z_2 = \gamma_2 + \delta_2 f_2(y_2)$    (1)

have the joint Normal bivariate distribution

    $p(z_1, z_2) = [2\pi\sqrt{(1-\rho^2)}]^{-1} \exp\{-\tfrac12 (1-\rho^2)^{-1}(z_1^2 - 2\rho z_1 z_2 + z_2^2)\}$,    (2)

the joint distribution of $y_1$ and $y_2$ is thereby determined. While we cannot, of course, say that
the joint distribution of $z_1$ and $z_2$ must be of form (2), it is an assumption which should be
reasonable in many cases. With equal cogency, therefore, it can be argued that the joint
distribution of the variables $y_1$ and $y_2$, defined by (1), should be that implied by (2).

The transformations for $y_1$ and $y_2$ need not belong to the same system. Thus, restricting ourselves
to the Normal system, $S_N$, and the systems $S_L$, $S_B$ and $S_U$, we have ten different systems
of bivariate distributions for $y_1$ and $y_2$. We will denote these systems by symbols $S_{NN}$, $S_{NL}$,
$S_{BB}$, etc. $S_{NN}$ is the bivariate Normal distribution; $S_{LL}$ is the ‘logarithmic surface’ considered
by Wicksell (1917); $S_{NL}$ is the ‘semi-logarithmic surface’ considered by Yuan (1933).
Allowing for choice of dependent and independent variable, there will be sixteen types of
regression line (one being the linear regression corresponding to the bivariate Normal case).
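The construction in (1) and (2) can be illustrated by simulation: draw a correlated pair of unit Normal variables and invert each translation. This is a sketch under stated assumptions; the function names, the parameter values and the seed are illustrative and do not come from the paper.

```python
import math
import random

# Inverse translation functions y = f^{-1}(u) for the four systems.
INVERSE = {
    "N": lambda u: u,                           # f(y) = y
    "L": lambda u: math.exp(u),                 # f(y) = log y
    "B": lambda u: 1.0 / (1.0 + math.exp(-u)),  # f(y) = log{y/(1-y)}
    "U": lambda u: math.sinh(u),                # f(y) = log[y + sqrt(y^2+1)]
}

def sample_pair(sys1, sys2, g1, d1, g2, d2, rho, rng):
    """Draw (y1, y2) by transforming a correlated pair of unit Normals."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    y1 = INVERSE[sys1]((z1 - g1) / d1)  # solves z1 = g1 + d1 * f1(y1)
    y2 = INVERSE[sys2]((z2 - g2) / d2)  # solves z2 = g2 + d2 * f2(y2)
    return y1, y2

def count_bivariate_systems(labels="NLBU"):
    """Number of unordered pairs of marginal systems."""
    return len({tuple(sorted((a, b))) for a in labels for b in labels})

rng = random.Random(1949)  # arbitrary seed
pairs = [sample_pair("L", "B", 0.0, 1.0, 0.5, 2.0, 0.6, rng) for _ in range(1000)]
n_systems = count_bivariate_systems()
```

Counting unordered pairs of the four marginal systems reproduces the ten bivariate systems; counting ordered pairs gives the sixteen regression types.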


2. COMPARISON WITH OTHER BIVARIATE SYSTEMS


It is of interest to compare the principle employed above (transformation to Normality) 
with other principles which have been suggested as starting points in the construction of non- 
Normal bivariate distributions. Steffensen (1922, 1941) suggested that the joint distribution 
$p(y_1, y_2)$ might be supposed to be of the form


    $g_1(a_1 y_1 + a_2 y_2)\, g_2(b_1 y_1 + b_2 y_2)$,










where $a_1$, $a_2$, $b_1$, $b_2$ are constants and $g_1$, $g_2$ are univariate probability density functions. This
is equivalent (Fréchet, 1941) to supposing that independent ‘primary variables’ can be found
which are linear functions of the y’s. K. Pearson (1923), in a survey of methods of construc- 
tion of bivariate distributions, states that this method does not accord well with observed 
distributions. In the same paper, reference is made to a number of bivariate distributions 
arising from special cases of a generalization of the differential equation which generates the 
univariate Pearson system of curves (see also Rhodes, 1922). Reference is also made to con- 
temporaneous work by Narumi (1923), wherein bivariate distributions are derived from 
certain limitations on the shape of the regression lines and on the type of heteroscedasticity 
allowable. Pretorius (1930) showed that Narumi’s assumption of constant $\beta_1$ and $\beta_2$ in the
array distributions is not justifiable in much observational data. Pretorius also considered 
bivariate distributions based on Charlier curves, on Edgeworth’s method of translation, 
and on a development of our system $S_{LL}$. Recently Van Uven (1947, 1948) has developed
a complete system of surfaces generalizing the Pearson univariate system. 


3. GENERAL PROPERTIES OF THE $S_{IJ}$ SYSTEMS

(a) Array distributions. The system $S_{IJ}$ is defined by the equations

    $z_1 = \gamma_1 + \delta_1 f_I(y_1), \quad z_2 = \gamma_2 + \delta_2 f_J(y_2)$    (3)

($f_I$ and $f_J$ referring to the systems $S_I$ and $S_J$ respectively; $I, J = N, L, B$ or $U$), and by the
correlation, $\rho$, between $z_1$ and $z_2$.

If $y_1$ be fixed, so is $z_1$. For a fixed value of $z_1$, $z_2$ is distributed Normally with expected value
$\rho z_1$ and standard deviation $(1-\rho^2)^{1/2}$. Thus, for $y_1$ fixed, $\gamma_2 + \delta_2 f_J(y_2)$ is distributed Normally
with expected value $\rho\{\gamma_1 + \delta_1 f_I(y_1)\}$ and standard deviation $(1-\rho^2)^{1/2}$. Hence

    $[\gamma_2 + \delta_2 f_J(y_2) - \rho\{\gamma_1 + \delta_1 f_I(y_1)\}]\,(1-\rho^2)^{-1/2}$

is a unit Normal variable. Thus, given $y_1$, $y_2$ has a distribution of the same system, $S_J$, as its
marginal distribution, with

    $\gamma_2$ replaced by $[\gamma_2 - \rho\{\gamma_1 + \delta_1 f_I(y_1)\}]\,(1-\rho^2)^{-1/2}$
and $\delta_2$ replaced by $\delta_2 (1-\rho^2)^{-1/2}$.


Since $f_I(y_1)$ varies from $-\infty$ to $+\infty$, it is clear that (except when $\rho = 0$) the quantity
replacing $\gamma_2$ will change sign at some point in the range of variation of $y_1$. If $S_J$ be a symmetrical
transformation then the array distributions to either side of this value of $y_1$ will be
skew in opposite directions. Variation in skewness of this kind was present in some of the
examples used by Pretorius. It may, of course, so happen that the transition point is so far
out in the tail of the $y_1$ distribution that the reversal effect is negligible.

It may also be noted that the quantity replacing $\delta_2$ does not vary with $y_1$. If the distribution
of $y_2$ be $S_B$ or $S_U$, the $(\beta_1, \beta_2)$ points of the array distributions all fall on a constant
$\delta$ line; if the distribution of $y_2$ be $S_L$, the $(\beta_1, \beta_2)$ point does not vary, so that the array
distributions are all of the same shape, though they are not homoscedastic.

A case of particular interest arises when the marginal distribution of $y_2$ is in the system $S_B$.

The array distribution of $y_2$, given $y_1$, will be bimodal provided

    $|\gamma_2 - \rho\{\gamma_1 + \delta_1 f_I(y_1)\}| < \delta_2^{-1}\sqrt{\{(1-\rho^2)(1-\rho^2-2\delta_2^2)\}} - 2\delta_2 \tanh^{-1}\sqrt{\{(1-\rho^2-2\delta_2^2)/(1-\rho^2)\}}$    (4)

(cf. equation (26) of Johnson (1949)).







We observe that

(i) If $\rho^2 > 1 - 2\delta_2^2$, then none of the array distributions are bimodal.

(ii) There are always some unimodal array distributions. Bimodal array distributions,
if any, correspond to values of $y_1$ in a single finite interval.

(b) Median regression. Since the array distribution of $y_2$, given $y_1$, is in the system $S_J$ to
which the marginal distribution of $y_2$ belongs, it follows that the expected value of $y_2$,
given $y_1$, may be expressed as an explicit function of $y_1$ using formulae derived by Johnson
(1949). This would give the regression curve of $y_2$ on $y_1$. The expression obtained would,
however, in general, be complicated and unsuited to further analytical study.

Much simpler and more informative formulae are obtained if we write down the expression
for the median value of $y_2$, given $y_1$. The possibility of using the median regression was mentioned
by Narumi (1923), though he did not pursue the idea.

The median value $\tilde{y}_2$ of $y_2$, for a fixed value of $y_1$, satisfies the equation

    $\gamma_2 + \delta_2 f_J(\tilde{y}_2) = \rho\{\gamma_1 + \delta_1 f_I(y_1)\}$,
i.e. $f_J(\tilde{y}_2) = (\rho\gamma_1 - \gamma_2)\,\delta_2^{-1} + \rho\delta_1\delta_2^{-1} f_I(y_1)$.    (5)

The median regression will be linear if $\rho\delta_1 = \delta_2$, $\rho\gamma_1 = \gamma_2$ and $f_I = f_J$. The only other
case giving linear regression is $f_I(y) = f_J(y) = y$ which is $S_{NN}$, the Normal bivariate
case.

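Table 1 shows the median regression equations for the sixteen possible cases covered by the
$S_B$, $S_U$, $S_L$ and Normal marginal distributions. For convenience the parameters

    $\theta = e^{(\rho\gamma_1 - \gamma_2)/\delta_2}, \quad \phi = \rho\delta_1/\delta_2$    (6)

have been introduced.

Table 1

Distribution of        Median regression of $y_2$ on $y_1$
$y_2$      $y_1$
$S_N$      $S_N$       $\tilde{y}_2 = \log\theta + \phi y_1$
$S_N$      $S_L$       $\log\theta + \phi\log y_1$
$S_L$      $S_N$       $\theta e^{\phi y_1}$
$S_N$      $S_B$       $\log\theta + \phi\log\{y_1/(1-y_1)\}$
$S_B$      $S_N$       $1/(1+\theta^{-1}e^{-\phi y_1})$
$S_N$      $S_U$       $\log\theta + \phi\log[y_1 + \sqrt{(y_1^2+1)}]$
$S_U$      $S_N$       $\tfrac12(\theta e^{\phi y_1} - \theta^{-1}e^{-\phi y_1})$
$S_L$      $S_L$       $\theta y_1^{\phi}$
$S_L$      $S_B$       $\theta\{y_1/(1-y_1)\}^{\phi}$
$S_B$      $S_L$       $1/(1+\theta^{-1}y_1^{-\phi})$
$S_L$      $S_U$       $\theta[y_1 + \sqrt{(y_1^2+1)}]^{\phi}$
$S_U$      $S_L$       $\tfrac12(\theta y_1^{\phi} - \theta^{-1}y_1^{-\phi})$
$S_B$      $S_B$       $\theta y_1^{\phi}/\{(1-y_1)^{\phi} + \theta y_1^{\phi}\}$
$S_B$      $S_U$       $[1+\theta^{-1}\{\sqrt{(y_1^2+1)}-y_1\}^{\phi}]^{-1}$
$S_U$      $S_B$       $\tfrac12[\theta\{y_1/(1-y_1)\}^{\phi} - \theta^{-1}\{(1-y_1)/y_1\}^{\phi}]$
$S_U$      $S_U$       $\tfrac12\{\theta[y_1 + \sqrt{(y_1^2+1)}]^{\phi} - \theta^{-1}[\sqrt{(y_1^2+1)}-y_1]^{\phi}\}$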


The general appearance of these curves is indicated in Figs. 1-9 (those involving Normal
marginal distributions in respect to either $y_1$ or $y_2$ are omitted). It will be noted that in many
cases the general slope of the regression depends on the range of values in which $\phi$ lies. The
case $\phi<0$ will arise when $\rho<0$, as we have assumed throughout this paper that $\delta$ is
positive.
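All sixteen median regressions are instances of the single relation (5), so they can be evaluated with one function and the two tables of translation functions. The sketch below is illustrative: the names `F`, `F_INV`, `theta_phi` and `median_y2`, and the numerical parameter values, are assumptions and not part of the paper.

```python
import math

# Translation functions f and their inverses for the four systems.
F = {
    "N": lambda y: y,
    "L": math.log,
    "B": lambda y: math.log(y / (1.0 - y)),
    "U": math.asinh,  # log[y + sqrt(y^2 + 1)]
}
F_INV = {
    "N": lambda u: u,
    "L": math.exp,
    "B": lambda u: 1.0 / (1.0 + math.exp(-u)),
    "U": math.sinh,
}

def theta_phi(g1, d1, g2, d2, rho):
    """Parameters theta and phi of equation (6)."""
    return math.exp((rho * g1 - g2) / d2), rho * d1 / d2

def median_y2(sys2, sys1, theta, phi, y1):
    """Median of y2 given y1 from f_J(median) = log(theta) + phi * f_I(y1)."""
    return F_INV[sys2](math.log(theta) + phi * F[sys1](y1))

theta, phi = theta_phi(0.5, 1.2, -0.3, 2.0, 0.6)  # illustrative values

# Closed forms from Table 1 for comparison.
ll_closed = theta * 2.5 ** phi
bb_closed = theta * 0.3 ** phi / (0.7 ** phi + theta * 0.3 ** phi)
ul_closed = 0.5 * (theta * 2.5 ** phi - 2.5 ** (-phi) / theta)
```

Calling `median_y2("L", "L", theta, phi, 2.5)` and comparing with `ll_closed` checks the $S_L$ on $S_L$ row; the other rows can be checked in the same way.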








[Figs. 1-9. Median regression curves of $y_2$ on $y_1$ for the nine combinations of $S_L$, $S_B$ and $S_U$ marginal distributions, with curves distinguished according to the range of $\phi$ ($\phi<0$, $0<\phi<1$, $\phi>1$).]

The examples in Figs. 1 and 2 should suffice to show the effect of reversing the sign of $\phi$.


4. PERCENTAGE ZONES IN THE $S_{IJ}$ SYSTEMS


It is, of course, a simple matter to construct regions in the $(y_1, y_2)$ plane which will contain
any desired proportion of the frequency distribution of $y_1$ and $y_2$. If $k$ be the number defined by

    $\frac{1}{\sqrt{(2\pi)}} \int_{-k}^{k} e^{-\frac12 t^2}\,dt = \alpha$,

then, for a fixed value of $y_1$,

    $\Pr\left\{ \frac{-k - [\gamma_2 - \rho\{\gamma_1 + \delta_1 f_I(y_1)\}](1-\rho^2)^{-1/2}}{\delta_2 (1-\rho^2)^{-1/2}} < f_J(y_2) < \frac{k - [\gamma_2 - \rho\{\gamma_1 + \delta_1 f_I(y_1)\}](1-\rho^2)^{-1/2}}{\delta_2 (1-\rho^2)^{-1/2}} \right\} = \alpha$.

Hence the curves

    $\pm k(1-\rho^2)^{1/2} + \gamma_2 + \delta_2 f_J(y_2) = \rho\{\gamma_1 + \delta_1 f_I(y_1)\}$    (7)

will enclose a zone containing a proportion $\alpha$ of the total frequency surface. This zone may
be regarded as a ‘percentage zone’ for $y_2$ given $y_1$. It will be noted that the boundary curves
(7) will be of the same type as the median regression of $y_2$ on $y_1$. They will have the same
‘$\phi$’ ($= \rho\delta_1/\delta_2$) as the median regression, but the values of ‘$\theta$’ will be

    $\exp[\delta_2^{-1}\{\rho\gamma_1 - \gamma_2 \pm k(1-\rho^2)^{1/2}\}]$.

Naturally there will be another similar zone containing a proportion $\alpha$ of the total frequency
forming a ‘percentage zone’ for $y_1$ given $y_2$.

A third region containing a given proportion of the total frequency, which is of some interest,
is that obtained from the transformation of the corresponding $\chi^2$ ellipse:

    $z_1^2 - 2\rho z_1 z_2 + z_2^2 = \text{const.}$    (8)

Substituting from (1) in (8) will give the equation of the boundary of a region in the $(y_1, y_2)$
plane which will contain the same proportion of total frequency as does the $\chi^2$ ellipse in the
$(z_1, z_2)$ plane.
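The boundary curves (7) can be evaluated pointwise by solving for $f_J(y_2)$ and inverting the translation. The following sketch assumes illustrative parameter values and helper names (`F`, `F_INV`, `percentage_zone`); the Normal quantile is taken from Python's standard `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

F = {
    "N": lambda y: y,
    "L": math.log,
    "B": lambda y: math.log(y / (1.0 - y)),
    "U": math.asinh,
}
F_INV = {
    "N": lambda u: u,
    "L": math.exp,
    "B": lambda u: 1.0 / (1.0 + math.exp(-u)),
    "U": math.sinh,
}

def percentage_zone(sys2, sys1, g1, d1, g2, d2, rho, y1, alpha):
    """Lower and upper limits of y2, given y1, enclosing a proportion alpha."""
    k = NormalDist().inv_cdf(0.5 * (1.0 + alpha))  # central Normal limit
    centre = rho * (g1 + d1 * F[sys1](y1)) - g2
    half_width = k * math.sqrt(1.0 - rho ** 2)
    lower = F_INV[sys2]((centre - half_width) / d2)
    upper = F_INV[sys2]((centre + half_width) / d2)
    return lower, upper

# Illustrative S_B-on-S_L case with a 95 per cent zone.
lo, hi = percentage_zone("B", "L", 0.5, 1.2, -0.3, 2.0, 0.6, 2.5, 0.95)
```

Because $\delta_2$ is positive and each inverse translation is increasing, `lo` is always below `hi`.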

5. METHOD OF FITTING 


The most straightforward way of fitting a joint distribution of the kind proposed above seems 
to be as follows: 


(i) Fit each marginal distribution by a curve from one of the four systems (Normal,
$S_L$, $S_B$, $S_U$).

(ii) Calculate the observed correlation between the transformed variables, using the results
of (i) to obtain the latter.

The functions $f_I$ and $f_J$, and the values of all the parameters except $\rho$, are determined by (i);
(ii) determines $\rho$.
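Step (ii) amounts to transforming each observation with the fitted marginal parameters and taking an ordinary product-moment correlation. A minimal sketch follows; the paired measurements are made up purely for illustration (they are not the bean data), and the helper names are assumptions.

```python
import math

def transform(xs, gamma, delta, xi, lam, f):
    """Apply z = gamma + delta * f((x - xi) / lam) to each observation."""
    return [gamma + delta * f((x - xi) / lam) for x in xs]

def pearson_r(u, v):
    """Product-moment correlation of two equal-length samples."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

# Hypothetical paired measurements, for illustration only.
lengths = [12.1, 13.4, 14.0, 14.6, 15.2, 15.9]
breadths = [7.0, 7.6, 8.1, 8.0, 8.5, 8.8]

# Marginal S_U parameters as fitted in step (i).
z1 = transform(lengths, 2.38, 2.64, 16.0745, 1.5192, math.asinh)
z2 = transform(breadths, 2.13, 3.55, 8.6195, 0.9721, math.asinh)
rho_hat = pearson_r(z1, z2)
```

Since the correlation is unchanged by a positive linear change of scale, only $f$, $\xi$ and $\lambda$ of each margin affect the estimate of $\rho$; $\gamma$ and $\delta$ drop out.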

The example below illustrates this method of fitting. It will be noted that in this example 
the higher order moments of the arrays are not well reproduced by the fitted surface. It is 
well to remember, however, that Pretorius (1930) obtained results which were scarcely as 
good when he fitted a surface of Jorgensen’s Type AA to the same data. The equation of this 
surface contains fourteen arbitrary constants, while the $S_{UU}$ surface fitted in the example
below depends on only nine arbitrary constants. 

Example. In examples 3 and 4 of Johnson (1949), curves of system $S_U$ have been fitted to
distributions of length and breadth, respectively, of 9440 beans. We now proceed to fit a
surface of system $S_{UU}$ to the joint distribution of length and breadth. The correlation between
the transformed variables

    $\sinh^{-1}\{(\text{length} - 16.0745)/1.5192\}$ and $\sinh^{-1}\{(\text{breadth} - 8.6195)/0.9721\}$

is found to be 0.746. Hence, denoting length by $x_1$ and breadth by $x_2$, the constants of the
fitted $S_{UU}$ surface are

    $\gamma_1 = 2.38$, $\delta_1 = 2.64$, $\xi_1 = 16.0745$, $\lambda_1 = 1.5192$,
    $\gamma_2 = 2.13$, $\delta_2 = 3.55$, $\xi_2 = 8.6195$, $\lambda_2 = 0.9721$,
    $\rho = 0.746$.

[Figs. 10-13. Regression, scedasticity, skewness and kurtosis of the arrays of breadth on length (mean, S.D., $\sqrt{\beta_1}$ and $\beta_2$ of arrays of breadth of beans, plotted against length of beans in mm.). Figs. 14-17. The corresponding diagrams for the arrays of length on breadth, plotted against breadth of beans in mm.]


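Consider now the array distribution of $x_2$ (breadth) for a fixed value of $x_1$ (length). This will
be a curve of system $S_U$, with parameters

    $\gamma(x_2 \mid x_1) = \{\gamma_2 - \rho\gamma_1 - \rho\delta_1 \sinh^{-1}[(x_1-\xi_1)/\lambda_1]\}\,(1-\rho^2)^{-1/2}, \quad \delta(x_2 \mid x_1) = \delta_2(1-\rho^2)^{-1/2}$.

Hence, inserting numerical values,

    $\delta(x_2 \mid x_1) = 5.3307$

and $\Omega(x_2 \mid x_1) = \gamma(x_2 \mid x_1)/\delta(x_2 \mid x_1)$
        $= \delta_2^{-1}(\gamma_2 - \rho\gamma_1) - \rho\delta_1\delta_2^{-1}\sinh^{-1}[(x_1-\xi_1)/\lambda_1]$
        $= 0.104 - 0.5548\,\sinh^{-1}[(x_1 - 16.0745)/1.5192]$.

Also, of course, $\xi(x_2 \mid x_1) = \xi_2 = 8.6195$, $\lambda(x_2 \mid x_1) = \lambda_2 = 0.9721$.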
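The array parameters follow from the fitted constants by simple arithmetic, which can be retraced as below. This is a sketch using the constants as printed (rounded to two decimals for the $\gamma$'s and $\delta$'s), so small differences from the quoted figures are to be expected; in particular the constant term of $\Omega(x_2 \mid x_1)$ comes out near 0.100 rather than 0.104, presumably because unrounded constants were used in the original calculation. The variable names are illustrative.

```python
import math

# Fitted constants of the S_UU surface (length = x1, breadth = x2).
g1, d1, xi1, lam1 = 2.38, 2.64, 16.0745, 1.5192
g2, d2, xi2, lam2 = 2.13, 3.55, 8.6195, 0.9721
rho = 0.746

s = math.sqrt(1.0 - rho ** 2)

# Arrays of breadth for fixed length.
delta_21 = d2 / s                 # delta(x2 | x1), close to 5.33
slope_21 = rho * d1 / d2          # coefficient of sinh^{-1} term, close to 0.555
const_21 = (g2 - rho * g1) / d2   # constant term of Omega(x2 | x1)

# Arrays of length for fixed breadth.
delta_12 = d1 / s                 # delta(x1 | x2), close to 3.96
slope_12 = rho * d2 / d1          # close to 1.003
const_12 = (g1 - rho * g2) / d1   # close to 0.300

def omega_21(x1):
    """Omega(x2 | x1) = gamma(x2 | x1) / delta(x2 | x1) for a given length."""
    return const_21 - slope_21 * math.asinh((x1 - xi1) / lam1)
```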


Using these values of the parameters the expected value, standard deviation, $\sqrt{\beta_1}$ and $\beta_2$
of the array distributions of breadth were calculated. The results are shown graphically in
the continuous curves in Figs. 10-13. The ringed dots show the corresponding values cal- 
culated from the observed array distributions by Pretorius. Figs. 10-13 are to be compared 
with that author’s diagrams VI (a)-(d); to facilitate the comparison they have been 
constructed in a similar manner. 


For array distributions of $x_1$ (length) for a fixed $x_2$ it was found that

    $\delta(x_1 \mid x_2) = 3.9642$,
    $\Omega(x_1 \mid x_2) = 0.300 - 1.0031\,\sinh^{-1}[(x_2 - 8.6195)/0.9721]$,
    $\xi(x_1 \mid x_2) = \xi_1 = 16.0745$, $\lambda(x_1 \mid x_2) = \lambda_1 = 1.5192$.

Figs. 14-17 are analogous to Figs. 10-13 and compare curves giving the theoretical values of
constants of the array distributions of $x_1$ with the observed values (cf. Pretorius’s diagrams
VI (e)-(h)).

From a study of the eight diagrams (Figs. 10-17) it appears that agreement between theory 
and observation becomes less satisfactory as higher moments come into consideration. The 
regression lines are quite close fits. The scedastic lines, while not giving a close fit, seem to be 
fairly reasonable, and are about as good fits as the Type AA lines shown by Pretorius. In 
the case of the lines showing skewness and kurtosis the most that can be said is that they 
do indicate the general trend and order of the observed values. A peculiar feature is that it 
seems that in all four cases the fit of these lines would be improved if they were displaced to 
the right (i.e. in the direction of decreasing length or breadth). 

It should be noted that Pretorius did not calculate the $\sqrt{\beta_1}$ and $\beta_2$ coefficients for the
arrays of the Type AA surface.




REFERENCES 


FRÉCHET, M. (1941). Skand. Aktuar. 24, 214.
JOHNSON, N. L. (1949). Biometrika, 36, 149.
NARUMI, S. (1923). Biometrika, 15, 77.
PEARSON, K. (1923). Biometrika, 15, 222, 231.
PRETORIUS, S. J. (1930). Biometrika, 22, 109.
RHODES, E. C. (1922). Biometrika, 14, 355.
STEFFENSEN, J. F. (1922). Skand. Aktuar. 5, 73.
STEFFENSEN, J. F. (1941). Skand. Aktuar. 24, 1.
VAN UVEN, M. J. (1947). Ned. Akad. Wet. Proc. 50, 1063, 1252.
VAN UVEN, M. J. (1948). Ned. Akad. Wet. Proc. 51, 41, 191.
WICKSELL, S. D. (1917). Ark. Mat. Ast. Fys. 12, no. 20.
YUAN, P. T. (1933). Ann. Math. Statist. 4, 30.







A TEST FOR RANDOMNESS IN A SEQUENCE OF TWO 
ALTERNATIVES INVOLVING A 2x2 TABLE 


By P. G. MOORE, University College, London 


1. INTRODUCTION 


Recently the power function technique has been used to examine tests of randomness applied 
to a sequence of two alternatives. F. N. David (1947) has considered the power function of 
what may be termed the ‘group’ test, and G. I. Bateman (1948) has similarly considered the 
‘longest run’ test. In both cases the hypothesis tested, say $H_0$, is that there is randomness
within the sequence against the admissible alternative, $H_1$, of dependence of the kind found
in a simple Markoff chain. We propose here to deal with a slight variation of the group test 
which leads to a 2 x 2 table. We shall then examine the power of this test against the alter- 
native of positive dependence in the sequence of the type met with in a simple Markoff chain. 
This will enable us, before some set of data is collected, to find approximately how large a 
sample is necessary in order to be able to detect a specified degree of dependence using a given 
significance level. 

In the sequence of alternatives we write 1 for the happening of a certain event and 0 for
its negation. For example, in a sequence of tyres coming off a manufacturing plant, a 1 might
indicate a tyre rejected as not being up to standard. We would be interested here in seeing
whether the rejected tyres were randomly placed in the sequence or whether there was an
undue clustering effect of the rejected tyres. Taking successive couplets along the sequence,
the usual procedure is to represent the result in a 2 x 2 table using the notation given in
Table 1. The value of $N$ will obviously be one less than the number of items in the sequence
and also $r$ and $m$ will not differ by more than unity. Our test will be concerned with the form
of this table under (i) randomness and (ii) dependence of the simple Markoff type.














Table 1

                      Present trial
                      1      0      Total
Preceding      1      a      c      m
trial          0      b      d      n
Total                 r      s      N




















2. PRESENTATION OF DATA 


The preceding manner of arranging the data has been used, among others, by Cochran (1938). 
He then compared the two proportions a/m and b/n, using the standard test for this kind of 
data. However, for a given sequence, the values in the 2 x 2 table formed are not uniquely 
fixed by the numbers of 1’s and 0’s in the sequence. To clarify the position we may consider 




how the various cases arise. Suppose that in a sequence of $R$ units there are $r_1$ 1’s and $r_2$ 0’s.
Obviously $r_1 + r_2 = R$. Four cases may now be distinguished.

A. Sequence begins with a 1 and ends with a 0.

B. Sequence begins with a 1 and ends with a 1.

C. Sequence begins with a 0 and ends with a 0.

D. Sequence begins with a 0 and ends with a 1.

A and D must each have an even number of groups, say $2t$, in the sequence, while B and C
will have an odd number, say $2t + 1$. These cases give rise to the 2 x 2 tables shown in Table 2.






















































































Table 2

Case A
                      Present trial
                      1              0              Total
Preceding      1      $r_1-t$        $t$            $r_1$
trial          0      $t-1$          $r_2-t$        $r_2-1$
Total                 $r_1-1$        $r_2$          $R-1$

Case B
                      Present trial
                      1              0              Total
Preceding      1      $r_1-t-1$      $t$            $r_1-1$
trial          0      $t$            $r_2-t$        $r_2$
Total                 $r_1-1$        $r_2$          $R-1$

Case C
                      Present trial
                      1              0              Total
Preceding      1      $r_1-t$        $t$            $r_1$
trial          0      $t$            $r_2-t-1$      $r_2-1$
Total                 $r_1$          $r_2-1$        $R-1$

Case D
                      Present trial
                      1              0              Total
Preceding      1      $r_1-t$        $t-1$          $r_1-1$
trial          0      $t$            $r_2-t$        $r_2$
Total                 $r_1$          $r_2-1$        $R-1$
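The relation between the 2 x 2 table of couplets and the number of groups can be checked on a short sequence. The sketch below uses the cell notation of Table 1 ($a$, $b$, $c$, $d$); the function names and the example sequence are illustrative assumptions.

```python
def couplet_table(seq):
    """Return (a, b, c, d) in the notation of Table 1 for a 0/1 sequence.

    a: 1 followed by 1, c: 1 followed by 0,
    b: 0 followed by 1, d: 0 followed by 0.
    """
    a = b = c = d = 0
    for prev, cur in zip(seq, seq[1:]):
        if prev == 1 and cur == 1:
            a += 1
        elif prev == 1 and cur == 0:
            c += 1
        elif prev == 0 and cur == 1:
            b += 1
        else:
            d += 1
    return a, b, c, d

def count_groups(seq):
    """Number of groups (runs of like symbols) in the sequence."""
    return 1 + sum(1 for p, q in zip(seq, seq[1:]) if p != q)

# A Case A sequence: begins with 1, ends with 0.
seq = [1, 1, 0, 0, 0, 1, 0]
r1 = sum(seq)               # number of 1's
r2 = len(seq) - r1          # number of 0's
groups = count_groups(seq)  # 2t groups in Case A
t = groups // 2
table = couplet_table(seq)
```

For this sequence the cells agree with the Case A entries $(r_1-t,\ t-1,\ t,\ r_2-t)$ for $(a, b, c, d)$.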














It is clear that, although the data have been exhibited in the form of a 2 x 2 table, the underlying
criterion is still that of the number of groups in a sequence of $r_1$ 1’s and $r_2$ 0’s. Hence


we are concerned with finding the power function of a test based on the number of groups 
in the sequence. 


3. CALCULATION OF POWER FUNCTION 

The principles of a Markoff chain have been discussed in detail, for example, by Fréchet
(1938), but the simple notation used here will be that of David (1947). If $E_s$ represents the
$s$th event, which may either be a 1 or a 0, then under $H_0$, the hypothesis of independence,

    $P\{E_s = 1\} = p$,  $P\{E_s = 0\} = q$,  where $p + q = 1$.

Under the alternate hypothesis $H_1$, we have

    $P\{E_s = 1\} = P$,  $P\{E_s = 0\} = Q$,
    $P\{E_s = 1 \mid E_{s-1} = 1\} = p_1$,  $P\{E_s = 0 \mid E_{s-1} = 1\} = q_1$,
    $P\{E_s = 1 \mid E_{s-1} = 0\} = p_2$,  $P\{E_s = 0 \mid E_{s-1} = 0\} = q_2$,

where $P + Q = 1$, $p_1 + q_1 = 1$ and $p_2 + q_2 = 1$.














' 





Making the simplifying assumption (David, 1947, p. 337) that if nothing is known about the
$s-1$ trials preceding the $s$th trial, then $P\{E_s = 1\} = P$ and $P\{E_s = 0\} = Q$, we have that
$p_2 = Pq_1/Q$ or $P = p_2/(p_2 + q_1)$. Using these relations and taking, say, the case of an even
number of groups, it is possible with the aid of the formulae for $P\{2t \text{ groups} \mid r_1 r_2 H_1\}$ given
by David (1947, p. 337) to evaluate a conditional power function. However, laborious
calculations are necessary to evaluate this, even for a small sequence, so that some form of
approximation is required. We will find, separately for an even ($2t$) or odd ($2t+1$) number of
groups, the forms of $p(t \mid r_1 r_2 H_0)$ and $p(t \mid r_1 r_2 H_1)$, where $p(\ )$ indicates a continuous probability
distribution taken as an approximation to the set of discrete probabilities denoted
by $P\{\ \}$. Then our ‘conditional’ critical region $w_0$ will be such that

    $P\{E \in w_0 \mid r_1 r_2 H_0\} \le \alpha$,

where $E$ is the sample point. Since our alternate hypothesis is that there is positive dependence
in the sequence, we are interested in establishing the significance of low values of $t$. Hence
we want to find the greatest value of $t_0$ such that

    $P\{t \le t_0 \mid H_0\} \le \alpha$.

This is equivalent to finding the greatest value of $t_0$ satisfying

    $\sum_{t=1}^{t=t_0} P\{t \mid r_1 r_2 H_0\} \le \alpha$.    (1)

The power of the test, $\eta$, will be given by

    $\eta = P\{E \in w_0 \mid r_1 r_2 H_1\}$
         $= P\{t \le t_0 \mid H_1\}$
         $= \sum_{t=1}^{t=t_0} P\{t \mid r_1 r_2 H_1\}$.    (2)


4. APPROXIMATION TO THE DISTRIBUTION

(a) Even number of groups

We will consider in the first place the case of an even number of groups. It is known (David,
1947, p. 335) that if there are an even number of groups ($2t$), then

    $P\{2t \mid r_1 r_2 H_0\} = k\;{}^{r_1-1}C_{t-1}\;{}^{r_2-1}C_{t-1}$,
    $P\{2t \mid r_1 r_2 H_1\} = k'\;{}^{r_1-1}C_{t-1}\;{}^{r_2-1}C_{t-1} \left(\frac{p_2 q_1}{q_2 p_1}\right)^{t}$,    (3)

$k$ and $k'$ being constants such that $\Sigma P\{2t\} = 1$, where $\Sigma$ denotes summation over all possible
values of $t$. From Wald & Wolfowitz (1940) we have that as $r_1$ tends to infinity, $r_1/r_2$ remaining
finite, the distribution in (3) under $H_0$ tends to normality so that

    $p(t \mid H_0) = \frac{1}{\sigma\sqrt{(2\pi)}} \exp\left[-\frac12\left(\frac{t-\tau}{\sigma}\right)^2\right]$,    (4)

where $\tau$ and $\sigma^2$ are the mean and variance of $t$, and can be derived from (3). Under $H_1$ we
shall have

    $p(t \mid H_1) \propto \frac{1}{\sigma\sqrt{(2\pi)}} \exp\left[-\frac12\left(\frac{t-\tau}{\sigma}\right)^2\right] \exp\left[t \log_e \frac{p_2 q_1}{p_1 q_2}\right]$;










on completing the square for $t$ and making the integral over all possible values of $t$ equal to
unity, this gives

    $p(t \mid H_1) = \frac{1}{\sigma\sqrt{(2\pi)}} \exp\left[-\frac{1}{2\sigma^2}\left(t - \tau - \sigma^2 \log_e \frac{p_2 q_1}{p_1 q_2}\right)^2\right]$.    (5)

Hence, $t$ is again normally distributed with the same variance as before but with a new mean,
$\tau'$, where

    $\tau' = \tau + \sigma^2 \log_e \frac{p_2 q_1}{p_1 q_2}$.    (6)
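The step of completing the square can be written out explicitly. In the sketch below $\omega$ is a shorthand introduced here for $\log_e(p_2 q_1/p_1 q_2)$; it is not a symbol used in the paper.

```latex
\begin{aligned}
-\frac{(t-\tau)^2}{2\sigma^2} + t\,\omega
  &= -\frac{1}{2\sigma^2}\left[t^2 - 2t(\tau + \sigma^2\omega) + \tau^2\right] \\
  &= -\frac{1}{2\sigma^2}\left[t - (\tau + \sigma^2\omega)\right]^2
     + \tau\omega + \tfrac12\sigma^2\omega^2 ,
\qquad \omega = \log_e\frac{p_2 q_1}{p_1 q_2}.
\end{aligned}
```

The last two terms do not involve $t$ and are absorbed into the normalizing constant, which leaves a Normal density with mean $\tau + \sigma^2\omega$ and unchanged variance $\sigma^2$.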
From equations (1) and (4) we can obtain the critical region and from (2) and (5) the power.
The values of $\tau$ and $\sigma^2$ may be obtained in a simple way following the method outlined for
the complete distribution by Wald & Wolfowitz. We need the following three quantities:

    $\alpha = \sum_{c \ge 1} {}^{r_1-1}C_{c-1}\,{}^{r_2-1}C_{c-1}, \quad \beta = \sum_{c \ge 1} c\;{}^{r_1-1}C_{c-1}\,{}^{r_2-1}C_{c-1}, \quad \gamma = \sum_{c \ge 1} c^2\;{}^{r_1-1}C_{c-1}\,{}^{r_2-1}C_{c-1}$.

It will be seen that $\alpha$ is the term independent of $x$ in the expansion of $(1+x)^{r_1-1}(1+1/x)^{r_2-1}$,
and hence is equal to ${}^{r_1+r_2-2}C_{r_1-1}$, while $(\beta - \alpha)$ is the term independent of $x$ in the expansion
of $(r_1-1)(1+x)^{r_1-2}\,x\,(1+1/x)^{r_2-1}$, and hence is equal to $(r_1-1)\,{}^{r_1+r_2-3}C_{r_1-1}$. The expression
$\sum (c-1)^2\;{}^{r_1-1}C_{c-1}\,{}^{r_2-1}C_{c-1}$ is the term independent of $x$ in the expansion of

    $(r_1-1)(r_2-1)(1+x)^{r_1-2}(1+1/x)^{r_2-2}$

and is thus equal to $(r_1-1)(r_2-1)\,{}^{r_1+r_2-4}C_{r_1-2}$. Using the identity $c^2 = (c-1)^2 + 2c - 1$ we
can evaluate $\gamma$. From these values we obtain for $\tau$ and $\sigma^2$

    $\tau = \frac{r_1 r_2 - 1}{r_1 + r_2 - 2}$,    (7)

    $\sigma^2 = \frac{(r_1-1)^2 (r_2-1)^2}{(r_1+r_2-2)^2 (r_1+r_2-3)}$.    (8)
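The closed forms (7) and (8) can be checked against a direct summation over the weights in (3) under $H_0$. The sketch below uses exact rational arithmetic; the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def even_moments_bruteforce(r1, r2):
    """Mean and variance of t under weights C(r1-1,t-1) * C(r2-1,t-1)."""
    ts = range(1, min(r1, r2) + 1)
    w = [comb(r1 - 1, t - 1) * comb(r2 - 1, t - 1) for t in ts]
    total = sum(w)
    mean = Fraction(sum(t * wt for t, wt in zip(ts, w)), total)
    second = Fraction(sum(t * t * wt for t, wt in zip(ts, w)), total)
    return mean, second - mean ** 2

def even_moments_closed(r1, r2):
    """Equations (7) and (8)."""
    n = r1 + r2
    tau = Fraction(r1 * r2 - 1, n - 2)
    var = Fraction((r1 - 1) ** 2 * (r2 - 1) ** 2, (n - 2) ** 2 * (n - 3))
    return tau, var

tau_100, var_100 = even_moments_closed(100, 100)  # the r1 = r2 = 100 case
```

For $r_1 = r_2 = r$ the closed forms reduce to $\tfrac12(r+1)$ and $(r-1)^2/4(2r-3)$.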





(b) Odd number of groups

For the case of an odd number of groups the expressions concerned are not nearly as simple.
We have, if there be $2t+1$ groups,

    $P\{2t+1 \mid r_1 r_2 H_0\} = l\,[{}^{r_1-1}C_{t}\,{}^{r_2-1}C_{t-1} + {}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t}]$,
    $P\{2t+1 \mid r_1 r_2 H_1\} = l' \left(\frac{p_2 q_1}{p_1 q_2}\right)^{t} \left[\frac{P}{p_1}\,{}^{r_1-1}C_{t}\,{}^{r_2-1}C_{t-1} + \frac{Q}{q_2}\,{}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t}\right]$,    (9)

$l$ and $l'$ being constants, such that the summation over all possible values of $t$ is unity. We
can deduce the mean ($\lambda$) and variance ($\mu^2$) of $t$ under $H_0$ from (9) by similar methods to those
employed above, giving

    $\lambda = \frac{(r_1-1)^2 + (r_2-1)^2}{r_1^2 + r_2^2 - r_1 - r_2} \cdot \frac{r_1 r_2}{r_1 + r_2 - 2}$,    (10)

    $\mu^2 = \frac{r_1 r_2\{(r_1-1)^2 (r_1 r_2 - r_2 - 1) + (r_2-1)^2 (r_1 r_2 - r_1 - 1)\}}{(r_1+r_2-2)(r_1+r_2-3)(r_1^2+r_2^2-r_1-r_2)} - \lambda^2$.    (11)








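Again, following Wald & Wolfowitz, we deduce that $t$ will tend to be normally distributed
as $r_1$ tends to infinity, $r_1/r_2$ remaining finite, so that

    $p(t \mid H_0) = \frac{1}{\mu\sqrt{(2\pi)}} \exp\left[-\frac12\left(\frac{t-\lambda}{\mu}\right)^2\right]$.

Under $H_1$ we have that

    $p(t \mid H_1) = \frac{1}{\mu_1\sqrt{(2\pi)}} \exp\left[-\frac{1}{2\mu_1^2}\left(t - \xi - \mu_1^2 \log_e \frac{p_2 q_1}{p_1 q_2}\right)^2\right]$,    (12)

where $\xi$ and $\mu_1^2$ are the mean and variance of

    $\frac{P}{p_1}\,{}^{r_1-1}C_{t}\,{}^{r_2-1}C_{t-1} + \frac{Q}{q_2}\,{}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t}$.

Using the same methods as before we obtain the expressions

    $\xi = \frac{r_1 r_2}{r_1+r_2-2} \cdot \frac{\frac{P}{p_1}(r_1-1)^2 + \frac{Q}{q_2}(r_2-1)^2}{\frac{P}{p_1}\,r_1(r_1-1) + \frac{Q}{q_2}\,r_2(r_2-1)}$,    (13)

    $\mu_1^2 = \frac{r_1 r_2}{(r_1+r_2-2)(r_1+r_2-3)} \cdot \frac{\frac{P}{p_1}(r_1-1)^2 (r_1 r_2 - r_2 - 1) + \frac{Q}{q_2}(r_2-1)^2 (r_1 r_2 - r_1 - 1)}{\frac{P}{p_1}\,r_1(r_1-1) + \frac{Q}{q_2}\,r_2(r_2-1)} - \xi^2$.    (14)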
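Expressions (13) and (14), and with unit weights (10) and (11), can be verified by direct summation over the two terms of (9). In this sketch `a` and `b` stand for $P/p_1$ and $Q/q_2$; the function names are illustrative assumptions, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from math import comb

def odd_moments_bruteforce(r1, r2, a, b):
    """Mean and variance of t under the weights
    a*C(r1-1,t)*C(r2-1,t-1) + b*C(r1-1,t-1)*C(r2-1,t)."""
    ts = range(1, max(r1, r2))
    w = [a * comb(r1 - 1, t) * comb(r2 - 1, t - 1)
         + b * comb(r1 - 1, t - 1) * comb(r2 - 1, t) for t in ts]
    total = sum(w)
    mean = sum(t * wt for t, wt in zip(ts, w)) / total
    second = sum(t * t * wt for t, wt in zip(ts, w)) / total
    return mean, second - mean ** 2

def odd_moments_closed(r1, r2, a, b):
    """Equations (13) and (14); a = b = 1 gives (10) and (11)."""
    n = r1 + r2
    den = a * r1 * (r1 - 1) + b * r2 * (r2 - 1)
    xi = Fraction(r1 * r2, n - 2) * (a * (r1 - 1) ** 2 + b * (r2 - 1) ** 2) / den
    second = (Fraction(r1 * r2, (n - 2) * (n - 3))
              * (a * (r1 - 1) ** 2 * (r1 * r2 - r2 - 1)
                 + b * (r2 - 1) ** 2 * (r1 * r2 - r1 - 1)) / den)
    return xi, second - xi ** 2

# P = 0.6, p1 = 0.7, so q1 = 0.3, p2 = P*q1/Q = 0.45 and q2 = 0.55.
a = Fraction(6, 10) / Fraction(7, 10)    # P / p1
b = Fraction(4, 10) / Fraction(55, 100)  # Q / q2
xi_example, mu1_sq_example = odd_moments_closed(120, 80, a, b)

# Null case with r1 = r2 = 100.
lam_100, mu_sq_100 = odd_moments_closed(100, 100, Fraction(1), Fraction(1))
```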








5. EQUIVALENCE OF THE TESTS 

We desire to show that in large samples the cases of odd and even numbers of groups tend to
equivalence in the sense that the distribution of $t$ in both cases becomes identical. As we have
indicated that in both cases the distribution of $t$ tends to normality, it is only necessary to
show that the means and variances agree. Table 3 shows the scheme of means and variances
that we have obtained. Hence we require to show, for large values of $r_1$ and $r_2$, that (i) $\lambda$ and
$\xi$ tend to $\tau$ and (ii) $\mu^2$ and $\mu_1^2$ tend to $\sigma^2$. We note that, in the particular case $r_1 = r_2 = r$, the
expressions reduce to

    $\tau = \tfrac12(r+1)$, $\lambda = \xi = \tfrac12 r$, $\sigma^2 = (r-1)^2/4(2r-3)$, $\mu^2 = \mu_1^2 = r(r-2)/4(2r-3)$.
2 2 I I 





























Table 3

                                 Mean of $t$                                                          Variance of $t$,
                                 $H_0$        $H_1$                                                   $H_0$ and $H_1$
Even no. of groups ($2t$)        $\tau$       $\tau' = \tau + \sigma^2 \log_e (p_2 q_1/q_2 p_1)$      $\sigma^2$
Odd no. of groups ($2t+1$)       $\lambda$    $\nu = \xi + \mu_1^2 \log_e (p_2 q_1/q_2 p_1)$          $\mu^2$, $\mu_1^2$








When $P = 0.5$, $\xi = \lambda$ and $\mu_1^2 = \mu^2$ whatever $r_1$ and $r_2$. In general we see, from (7) and (10),
that as $r_1$ and $r_2$ get large $\tau$ and $\lambda$ tend to a common value. Similarly, from (8) and (11),
$\sigma^2$ and $\mu^2$ tend to a common value. Under $H_1$, however, the expressions are a little more
complicated. We see, from the forms of (7) and (13), that $\tau$ and $\xi$ both tend to $r_1 r_2/(r_1 + r_2 - 2)$.
From (8) and (14), together with (13), we see that $\sigma^2$ and $\mu_1^2$ both tend to

    $r_1^2 r_2^2/(r_1 + r_2 - 2)^2 (r_1 + r_2 - 3)$.


It is difficult to compare numerically the approximate with the true results as there are so 
many variables, but we will take some specific cases. 


(i) $P = 0.5$, whence $\xi = \lambda$ and $\mu_1^2 = \mu^2$. $r_1 + r_2 = 200$.






























































Table 4

                            $\tau$      $\xi = \lambda$    $\sigma^2$    $\mu_1^2 = \mu^2$
$r_1=100$   $r_2=100$       50.5        50                 12.4378       12.4365
$r_1=105$   $r_2=95$        50.1206     49.8763            12.3744       12.3719
$r_1=110$   $r_2=90$        49.7437     49.5050            12.1853       12.1915
$r_1=115$   $r_2=85$        49.1156     48.8859            11.8733       11.8854

(ii) $P \ne 0.5$. $r_1 + r_2 = 200$.

Table 5

        $P = 0.6$;  $r_1 = 120$, $r_2 = 80$              $P = 0.4$;  $r_1 = 80$, $r_2 = 120$

        $\tau = 48.2362$    $\lambda = 48.0188$    $\sigma^2 = 11.4433$    $\mu^2 = 11.4723$

$p_1$     $p_2$     $\xi$       $\mu_1^2$      |   $p_1$    $p_2$      $\xi$       $\mu_1^2$
0.7       0.45      48.0257     11.4679        |   0.5      0.3333     48.0238     11.4689
0.75      0.375     48.0280     11.4701        |   0.6      0.2667     48.0273     11.4696
0.8       0.3       48.0299     11.4709        |   0.7      0.2000     48.0299     11.4708
0.85      0.225     48.0314     11.4766        |   0.8      0.1333     48.0319     11.4725
0.9       0.15      48.0327     11.4770        |   0.9      0.0667     48.0335     11.4738


























These tables show that the differences involved do not appear to be very large when the 
sequence is of some length. 


6. THE TOTAL-GROUP TEST 


We have obtained in the preceding sections the power function for even and odd numbers of
groups separately in a sequence of $r_1$ 1’s and $r_2$ 0’s. We will now combine these results to form
what David (1947) has termed the conditional power function. That is, we will find the distribution
of $T$, the total number of groups, in a sequence of $r_1$ 1’s and $r_2$ 0’s under $H_1$. We have
shown that for an even ($2t$) or odd ($2t+1$) number of groups $t$ tends to be normally distributed
with mean $\tau'$ and variance $\sigma^2$. Now in the limit, i.e. as the sequence gets large, the probabilities
of getting an even or odd number of groups will each tend to a half. This can be seen if we
consider the discrete set of probabilities graduated by a normal curve. Hence we can combine
the two distributions to give a new mean, $\theta$, and variance, $v^2$, of the distribution of $T$,

    $\theta = \Sigma(2t)\,P\{\text{even } t\} + \Sigma(2t+1)\,P\{\text{odd } t\}$,

where the summation is taken over all possible values of $t$. In the limit we will get

    $\theta = 2\tau' + \tfrac12$.

Similarly, we know that

    $\sigma^2 = 2\Sigma(t^2)\,P\{\text{even } t\} - \tau'^2$
             $= 2\Sigma(t^2)\,P\{\text{odd } t\} - \tau'^2$.

Thus

    $v^2 = \Sigma(2t)^2\,P\{\text{even } t\} + \Sigma(2t+1)^2\,P\{\text{odd } t\} - (2\tau' + \tfrac12)^2$
         $= 2(\sigma^2 + \tau'^2) + 2(\sigma^2 + \tau'^2) + 2\tau' + \tfrac12 - 4\tau'^2 - 2\tau' - \tfrac14$
         $= 4\sigma^2 + \tfrac14$.
Under $H_0$, however, we may calculate the exact value of the mean and variance of $T$ following
the method used in § 4 and given by Wald & Wolfowitz (1940, p. 151). As the sequence gets
large $T$ may be assumed to be normally distributed with

    Mean $= \frac{2 r_1 r_2}{r_1 + r_2} + 1$,    (15)

    Variance $= \frac{2 r_1 r_2 (2 r_1 r_2 - r_1 - r_2)}{(r_1 + r_2)^2 (r_1 + r_2 - 1)}$.    (16)


7. THE HYPERGEOMETRIC DISTRIBUTION 
If we have $2t$ groups, then from (3),

    $P\{2t \mid r_1 r_2 H_0\} = {}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t-1} \big/ {}^{r_1+r_2-2}C_{r_1-1}$,

an expression proportional to a term in a hypergeometric series and also equal to the chance
of the partition in a 2 x 2 table with fixed margins (Table 6).

Table 6

    $r_1-t$      $t-1$        $r_1-1$
    $t-1$        $r_2-t$      $r_2-1$
    $r_1-1$      $r_2-1$      $r_1+r_2-2$

Hence, the mean of $(r_1 - t)$ is
$(r_1-1)^2/(r_1+r_2-2)$ or, mean $t = (r_1 r_2 - 1)/(r_1+r_2-2)$, and the variance of $(r_1 - t)$ and therefore
of $t$ is $(r_1-1)^2 (r_2-1)^2/(r_1+r_2-2)^2 (r_1+r_2-3)$, results which agree with those in equations
(7) and (8). These are analogous to the expressions given by Stevens (1939) for the number of
groups of 1’s and 2’s in a circular sequence, when there must be an even number of groups.
For the case of an odd number of groups we may take the two terms of (9) separately, the 






















first term giving an expression proportional to the probability of $2t+1$ groups in case B and
the second for case C. These expressions are also proportional to a term in a different hypergeometric
series and equal to the chances of the partitions in 2 x 2 tables with fixed margins
(Tables 6A, 6B).

















Table 6A. Case B

    $r_2-t$      $t$            $r_2$
    $t-1$        $r_1-t-1$      $r_1-2$
    $r_2-1$      $r_1-1$        $r_1+r_2-2$

Table 6B. Case C

    $r_2-t-1$    $t-1$          $r_2-2$
    $t$          $r_1-t$        $r_1$
    $r_2-1$      $r_1-1$        $r_1+r_2-2$
































The conditional power function obtained in the preceding sections is similar in many respects to that obtained by Patnaik (1948) in testing for the difference between two proportions when the data are put in the form of a $2\times2$ table. His conditional power function (pp. 162-3 of his paper) is the same as that derived here for $t$, remembering that he is using 'a' in the notation of our Table 1 as his criterion.

The usual procedure in the past to test for randomness in a sequence of two alternatives has been to apply the $\chi^2$ test to Table 1. It is clear, however, that the underlying basis is the group test. The procedure now suggested is that we deduce $T$ from Table 1. Then our test would be:

(i) Find $r_1$, $r_2$ and $T$ from the $2\times2$ table formed.
(ii) Calculate the mean and variance from (15) and (16).
(iii) From the normal curve tables find the probability of getting the observed value of $T$ or less under $H_0$.
(iv) Reject the hypothesis of independence if this probability is less than some preassigned level $\alpha$.
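Steps (i) to (iv) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the optional continuity correction of one-half and the level 0.05 are assumptions made here, and the example values are those of the first row of Table 7 ($r_1 = 25$, $r_2 = 15$, 12 groups).

```python
import math

def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def group_test(r1, r2, t_obs, continuity=False):
    """Normal approximation to P{T <= t_obs} under H0.

    Uses the mean and variance of equations (15) and (16); the
    continuity correction of 0.5 is an assumption of this sketch.
    """
    n = r1 + r2
    mean = 2.0 * r1 * r2 / n + 1.0                                    # (15)
    var = 2.0 * r1 * r2 * (2.0 * r1 * r2 - n) / (n ** 2 * (n - 1))    # (16)
    t = t_obs + 0.5 if continuity else t_obs
    z = (t - mean) / math.sqrt(var)
    return mean, var, normal_cdf(z)

# Illustrative values: sequence of 40 with r1 = 25, r2 = 15 and 12 groups.
mean, var, p = group_test(25, 15, 12, continuity=True)
reject = p < 0.05          # step (iv) with an assumed level of 0.05
```

With these inputs the probability comes out close to the 'normal' entry for 12 groups in Table 7; small differences in the last figure are to be expected from rounding in printed normal tables.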


In large sequences the two tests may be shown to tend to equivalence, but how far the difference between the tests is important for small samples has not yet been fully investigated, although in a number of cases comparisons have been made. As an example, Table 7 gives the probabilities of having $T_0$ groups or less in a sequence of 40 using the different methods of evaluation. For the 'normal' group test, equations (15) and (16) have been utilized with a continuity correction and the normal curve tables. For $\chi^2$, the probability in brackets is obtained with a continuity correction, the other probability being obtained without it. The last column gives the exact chances calculated from equations (3) and (9).

It can be seen that the exact and approximate distributions of $T$ are very similar, but with $\chi^2$ the use of a continuity correction appears to make all the probabilities too high. This discrepancy was noted in a number of other cases. It should also be noted that when $T$ is odd the probability obtained from $\chi^2$ is dependent on whether we have case B or case C. The differences between the two might be important in cases where the probabilities were near the preassigned significance level.
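The exact chances can be checked numerically. The sketch below does not use equations (3) and (9) of this paper directly; it assumes the familiar unconditional distribution of the number of groups when all arrangements of $r_1$ 1's and $r_2$ 0's are equally likely, and the function names are illustrative.

```python
from math import comb

def exact_group_pmf(r1, r2):
    """Exact P{T = t} when all arrangements are equally likely.

    Assumed standard result: an even number 2t of groups arises in
    2*C(r1-1,t-1)*C(r2-1,t-1) ways, an odd number 2t+1 in
    C(r1-1,t)*C(r2-1,t-1) + C(r1-1,t-1)*C(r2-1,t) ways.
    """
    total = comb(r1 + r2, r1)
    pmf = {}
    for t in range(1, min(r1, r2) + 1):
        even = 2 * comb(r1 - 1, t - 1) * comb(r2 - 1, t - 1)
        odd = (comb(r1 - 1, t) * comb(r2 - 1, t - 1)
               + comb(r1 - 1, t - 1) * comb(r2 - 1, t))
        pmf[2 * t] = even / total
        pmf[2 * t + 1] = odd / total
    return pmf

def exact_lower_tail(r1, r2, t0):
    """P{T <= t0} summed from the exact distribution."""
    return sum(p for t, p in exact_group_pmf(r1, r2).items() if t <= t0)

pmf = exact_group_pmf(25, 15)
tail_12 = exact_lower_tail(25, 15, 12)
tail_13 = exact_lower_tail(25, 15, 13)
```

For $r_1 = 25$, $r_2 = 15$ these tails can be compared with the last column of Table 7, and the mean of the exact distribution agrees with (15).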



















Table 7. Sequence of 40. $r_1 = 25$, $r_2 = 15$

             'Normal'    $\chi^2$                                 Exact
    $T_0$    groups      Cases A and D   Case B      Case C       groups

    12       0.0066      0.0066          -           -            0.0065
                         (0.0163)
    13       0.0162      -               0.0144      0.0193       0.0164
                                         (0.0324)    (0.0425)
    14       0.0362      0.0363          -           -            0.0365
                         (0.0733)
    15       0.0728      -               0.0656      0.0847       0.0738
                                         (0.1208)    (0.1525)
    16       0.1330      0.1339          -           -            0.1329
                         (0.2220)
    17       0.2206      -               0.2025      0.2489       0.2215
                                         (0.3105)    (0.3707)

The table gives $P\{T \le T_0\}$.


8. THE POWER FUNCTION; NUMERICAL COMPARISONS 


All the first set of calculations are based on a sequence of 40 units. The value of $P$ used throughout is 0.5, the significance of which is mentioned in the next section and, as we are concerned with the case of positive dependence, $p_1$ is taken greater than $P$.

Table 8 compares the values of the power function of the group test for different compositions of a sequence of 40 units. These values were calculated from the normal curve approximation, using a continuity correction. Allowing for differences in the value of $\alpha$, owing


Table 8. Power functions for sequences of 40 ($r_2 = 40 - r_1$)

                                     Value of $r_1$
    $p_1$      20       21       22       23       24       25       26       27

    0.5*     0.0274   0.0281   0.0315   0.0348   0.0424   0.0536   0.0150   0.0223
    0.55     0.0978   0.0995   0.1068   0.1144   0.1306   0.1520   0.0541   0.0713
    0.6      0.2565   0.2591   0.2688   0.2811   0.3047   0.3328   0.1513   0.1807
    0.65     0.5051   0.5076   0.5150   0.5283   0.5510   0.5748   0.3335   0.3683
    0.7      0.7660   0.7674   0.7694   0.7783   0.7913   0.8030   0.5838   0.6103
    0.75     0.9346   0.9348   0.9340   0.9372   0.9409   0.9433   0.8207   0.8301
    0.8      0.9920   0.9920   0.9915   0.9920   0.9923   0.9922   0.9579   0.9586
    0.85     0.9998   0.9998   0.9997   0.9997   0.9997   0.9997   0.9966   0.9963
    0.9      1.0      1.0      1.0      1.0      1.0      1.0      1.0      0.9999

    $\kappa$ 0.1254   0.1194   0.1031   0.0807   0.0572   0.0366   0.0211   0.0109

* When $p_1 = 0.5$, the value of the power function is the value of the first kind of error, usually denoted by $\alpha$.











to the discontinuity, the values are very consistent, and the test appears to be just as powerful over the limited range considered, whether or not the sequence has equal numbers of 1's and 0's. $\kappa$, given at the bottom of the table, is the probability, under the hypothesis $H_0$, that in a sequence of 40 units with $P = 0.5$ there will be just $r_1$ 1's. We may obtain the overall power function by considering a sequence of fixed length, say 40. Then, assuming that $r_1$ is normally distributed with mean $RP$ and variance $RPQ(1+\delta)/(1-\delta)$, where $\delta = p_1 - p_2$ and $R = r_1 + r_2$ (Uspensky, 1937, p. 301 following Markoff), we can use Table 8, extended slightly beyond $r_1 = 27$, to get this overall power function. We do this by weighting the tabulated powers by the probability of obtaining that division of $r_1$ and $r_2$ and then summing for all possible divisions of $R$. This would tell us that if we drew samples of 40 from a population for which $P = 0.5$ and $\delta = \delta_0$, then, using this group test, the chance of detecting that $\delta \ne 0$ is the power corresponding to $p_1 = 0.5 + \tfrac{1}{2}\delta_0$. These powers are given in Table 9 alongside those for a group test devised by Bateman.
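The weighting can be illustrated for the null row ($p_1 = 0.5$) of Table 8, where the weights $\kappa$ are simple binomial probabilities. In the sketch below the dictionary of tabulated sizes is copied from Table 8; the use of symmetry to supply the entries for $r_1 < 20$ and the truncation at $r_1 = 27$ are assumptions of the sketch, so the result only approximates the first entry of Table 9.

```python
from math import comb

R = 40  # length of sequence

# Row p1 = 0.5 of Table 8: size of the group test for r1 = 20, ..., 27.
size_by_r1 = {20: 0.0274, 21: 0.0281, 22: 0.0315, 23: 0.0348,
              24: 0.0424, 25: 0.0536, 26: 0.0150, 27: 0.0223}

def kappa(r1, n=R, p=0.5):
    """Binomial probability of exactly r1 ones in n independent units."""
    return comb(n, r1) * p ** r1 * (1 - p) ** (n - r1)

def overall_size():
    """Weight the tabulated sizes by kappa and sum over divisions of R.

    Assumes the size for r1 equals that for R - r1 and ignores
    divisions beyond r1 = 27, so the total is slightly too small.
    """
    total = 0.0
    for r1, size in size_by_r1.items():
        weight = kappa(r1)
        if r1 != R - r1:
            weight += kappa(R - r1)   # mirrored division, by assumed symmetry
        total += weight * size
    return total

kappas = [round(kappa(r1), 4) for r1 in range(20, 28)]
approx_size = overall_size()
```

For alternatives with $\delta \ne 0$ the weights would instead come from the normal distribution of $r_1$ described above.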


Table 9. Comparison of power functions, $r_1 + r_2 = 40$

                                                        Value of $p_1$
    Test used                   0.5      0.55     0.6      0.65     0.7      0.75     0.8      0.85     0.9

    $T$, as described above     0.0329   0.1080   0.2664   0.5041   0.7512   0.9183   0.9854   0.9990   1.000
    Bateman's                   0.0266   0.0952   0.2484   0.4867   0.7397   0.9138   0.9846   0.9989   1.0000






































Bateman (1948b) has shown that when $P = 0.5$ the total number of groups ($T$) is distributed as the binomial $(p_1 + q_1)^{R-1}$. Hence we may compare this test with the power function just computed. The slight differences are due to the fact that whereas Bateman's test has the same critical region, whatever $r_1$, our test has a critical region depending on $r_1$. The agreement is, however, very good. But when $P \ne 0.5$ Bateman's formula no longer applies and our approximate test must be used.
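A binomial calculation of this kind is easy to sketch. The reading adopted below is an assumption: with $P = 0.5$ each of the $R - 1$ adjacent pairs is taken to show a change with probability $q_1 = 1 - p_1$, so that $T - 1$ is binomial, and the critical region is taken as the largest lower tail of size not exceeding 0.05. The function names are illustrative.

```python
from math import comb

def binom_cdf(k, n, q):
    """P{X <= k} for X binomial with index n and probability q."""
    return sum(comb(n, i) * q ** i * (1 - q) ** (n - i) for i in range(k + 1))

def binomial_group_power(p1, length=40, alpha=0.05):
    """Power of a lower-tail test on the number of groups when P = 0.5.

    Assumes T - 1 (the number of changes) is binomial with index
    length - 1 and probability 1 - p1. Returns the largest T in the
    critical region and the power at p1.
    """
    n = length - 1
    c = -1
    # Largest number of changes c with P{X <= c} <= alpha under p1 = 0.5.
    while binom_cdf(c + 1, n, 0.5) <= alpha:
        c += 1
    return c + 1, binom_cdf(c, n, 1.0 - p1)

t_crit, size = binomial_group_power(0.5)
_, power_055 = binomial_group_power(0.55)
_, power_09 = binomial_group_power(0.9)
```

Under these assumptions the size at $p_1 = 0.5$ can be compared with the first entry of the second row of Table 9.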

The difficulty in getting from Table 8 to Table 9 is that it entails very long calculations. As an approximation to the overall power we may follow a method used by Patnaik (1948). Briefly, we replace $r_1$ and $r_2$ by their expected values of $RP$ and $R(1-P)$. The adequacy of this approximation can be seen by comparing the first column in Table 8 with Table 9. The agreement is very good.

Using the 0.05 nominal significance level for $H_0$, we can find the minimum length of sequence such that the chance is 0.9 of rejecting randomness by the group test, when $p_1$ has values greater than 0.5. Table 10 gives these minimum lengths of sequence. They are obtained in the same way as before, successive values of $r_1$ and $r_2$ being taken until the power just becomes 0.9. The table shows that if we are interested in picking out the degree of dependence of a certain magnitude the length of sequence necessary to do this with a given power varies considerably according to the value of $P$. In practice, however, it is possible that we would want the test to be equally powerful for the same values of some function of $\delta$ and $P$, and not just for equal values of $\delta$ alone.

















































Table 10

                          Value of $\Delta = p_1 - P = \delta(1-P)$
                 0.05     0.10     0.15     0.20     0.25     0.30     0.35     0.40

    P = 0.3      1800     466      20[?]    126      92       70       46       34
    P = 0.4      1240     320      120      [illegible]
    P = 0.5      904      220      9[?]     56       34       22       [illegible]
    P = 0.6      540      [illegible]
    P = 0.7      124      63       34       2[?]     [illegible]








9. APPLICATION OF RESULTS 


In a control chart an examination of runs of values above and below the median is often of assistance in eliminating trend effects. Mosteller (1941) uses the length of the runs, while
Shewhart (cited by Swed & Eisenhart, 1943) suggests the use of the technique we have 
discussed employing the number of runs. In cases of short sequences the power of the 
group test is low. But for longer sequences, say 40 or over, the technique provides a useful 
method for picking out clustering effects. It is assumed, from previous results, that the 
median value is known, whence it could be used to make P = 0-5. The foregoing test has been 
concerned solely with the case of positive dependence, i.e. p, > P, and if it were desired to 
test for dependence in both directions a two-ended test would have to be made on similar 
lines. 
Table 11

                                   Present month
                              Wet       Dry       Total

    Preceding      Wet        542       582       1124
    month          Dry        582       759       1341

                   Total      1124      1341      2465





Cochran (1938), in considering the distribution of wet and dry months, gives a $2\times2$ table. A wet month is defined as a month when rainfall is greater than the average for that month over the period 1881-1915. Hence we have a $P$ of approximately 0.5. The alternative to randomness in the order in which wet and dry months follow each other is that there is a persistence of one kind of weather, giving us the basic conditions for a Markoff chain. Since we are considering either case B or C of §2, the value of $T$ is $2\times582+1 = 1165$. Further, assuming that we have case C,* so that $r_1 = 1124$ and $r_2 = 1342$, we get that mean $T = 1224.36$ and s.e. of $T = 24.63$. Hence we refer the ratio $(1224.36 - 1165)/24.63 = 2.41$ to the normal curve tables and find that the probability of getting a value of $T$ less than or equal to 1165, under the hypothesis of independence, is 0.0080 or odds of 125 to 1 against independence. If a continuity correction is used the odds are 119 to 1 against independence. Thus we would reject the hypothesis $H_0$ in favour of $H_1$, the hypothesis that there is positive dependence, in the sense that wet months tend to follow wet months and similarly for dry months. Cochran found the proportion of months, following a wet month, which were wet and also the proportion following a dry month which were wet. These proportions are 0.4822 and 0.4340 respectively. The s.e. of the difference is, from the binomial theorem,

* To differentiate between cases B and C we would need to know whether the sequence started with a wet or a dry month.


VEE AEE (ria + rden)} = + 0-02013, 


and from normal curve tables we get odds of 120 to 1 against the observed difference arising by chance in a sample from a population for which these two proportions are equal. This test is equivalent to calculating $\sqrt{\chi^2}$, and if a continuity correction is employed the odds become 108 to 1 against independence. Thus using either test, in this case of a long sequence, we would reject the hypothesis of independence, although the difference due to the use of a continuity correction for $\chi^2$, which was noticed for short sequences, is still apparent.
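The arithmetic of this example can be retraced as follows. The sketch takes the counts from Table 11 and case C as assumed in the text; the variable names are illustrative, and the unpooled binomial form used for the standard error of the difference is an assumption made here because it reproduces the printed value to the figures given.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Counts from Table 11 (preceding month, present month).
wet_wet, wet_dry, dry_wet, dry_dry = 542, 582, 582, 759

# Group test, case C as assumed in the text: r1 = 1124, r2 = 1342.
T = 2 * wet_dry + 1
r1, r2 = 1124, 1342
n = r1 + r2
mean_T = 2 * r1 * r2 / n + 1                                                # (15)
sd_T = math.sqrt(2 * r1 * r2 * (2 * r1 * r2 - n) / (n ** 2 * (n - 1)))     # (16)
z_groups = (mean_T - T) / sd_T
p_groups = normal_cdf(-z_groups)

# Difference between the two proportions of wet months.
n_wet, n_dry = wet_wet + wet_dry, dry_wet + dry_dry
p_after_wet = wet_wet / n_wet
p_after_dry = dry_wet / n_dry
# Unpooled binomial standard error (an assumption of this sketch).
se_diff = math.sqrt(p_after_wet * (1 - p_after_wet) / n_wet
                    + p_after_dry * (1 - p_after_dry) / n_dry)
z_prop = (p_after_wet - p_after_dry) / se_diff
p_prop = normal_cdf(-z_prop)
```

The reciprocals of `p_groups` and `p_prop` give odds of roughly the size quoted in the text.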


10. SUMMARY 


We have obtained an approximate formula which makes it easy to find the power of the group test for a sequence of two alternatives when $H_0$ is the hypothesis that there is randomness in the sequence and whilst for $H_1$ the dependence follows a simple Markoff chain. This formula has been used to find the minimum lengths of sequence necessary to pick out various degrees of dependence with a given power. Finally, the $\chi^2$ test for independence in a $2\times2$ table is shown to lead to the same test for long sequences.


The author wishes to thank Dr F. N. David and Professor E. S. Pearson for their guidance during the preparation of this paper.


REFERENCES 


BATEMAN, G. I. (1948a). Biometrika, 35, 97.
BATEMAN, G. I. (1948b). Unpublished thesis.
COCHRAN, W. G. (1938). Quart. J.R. Met. Soc. 64, 631.
DAVID, F. N. (1947). Biometrika, 34, 335.
FRÉCHET, M. (1938). Recherches théoriques modernes sur le calcul des probabilités, Book 2.
MOSTELLER, F. (1941). Ann. Math. Statist. 12, 228.
PATNAIK, P. B. (1948). Biometrika, 35, 157.
STEVENS, W. L. (1939). Ann. Eugen., Lond., 9, 10.
SWED, F. S. & EISENHART, C. (1943). Ann. Math. Statist. 14, 66.
USPENSKY, J. V. (1937). Introduction to Mathematical Probability. McGraw Hill Book Co.
WALD, A. & WOLFOWITZ, J. (1940). Ann. Math. Statist. 11, 147.







A GENERAL DISTRIBUTION THEORY FOR A CLASS OF 
LIKELIHOOD CRITERIA 


By G. E. P. BOX 
Imperial Chemical Industries, Dyestuffs Division Headquarters, Blackley, Manchester 


1. INTRODUCTION 


The likelihood ratio method of Neyman & Pearson (1928) has been used by many different 
workers for the derivation of criteria appropriate for the testing of a large variety of hypo- 
theses. Plackett (1946), in a recent survey of literature on testing the equality of variances 
and covariances, lists, on this problem alone, criteria for the testing of no less than thirty-one 
hypotheses investigated at different times by workers in this field. Most of the criteria either 
have been or can be arrived at by the likelihood ratio method. In the preface to his survey 
Plackett says: ‘Generally speaking the difficulties in testing such hypotheses lie not so much 
in deriving criteria—but in finding their exact distributions when the hypotheses are true 
and determining the best critical region to adopt.’ 

Although in many cases the exact distributions cannot be obtained in a form which is of practical use, it is usually possible to obtain the moments, and these may be used to obtain approximations. In some cases, for instance, a suitable power of the likelihood statistic has been found to be distributed approximately in the type I form, and good approximations have been obtained by equating the moments of the likelihood statistic to those of this curve. For example, in the original paper on the $L_1$ test for homogeneity of variances, Neyman & Pearson (1931) suggested that the distribution could be approximately represented in this way, and later Bishop & Nair (1939) showed that the significance points obtained by Nayer (1936), using this method, were in excellent agreement with the true values. The fitting of the type I curve is simple once the moments are obtained, but these moments, being the products of $\Gamma$-functions, are usually rather troublesome to calculate. To overcome this difficulty, Bishop (1939), working on the distribution of the multivariate equivalent of the $L_1$ test (the test for constancy of variances and covariances in $k$ $p$-variate samples), derived empirical expressions for the parameters of the appropriate type I curve, thus avoiding the troublesome intermediate step of calculating moments. Bishop mentions that Nair succeeded in finding similar expressions on a theoretical basis, and Tukey & Wilks (1946) give a more general theoretical method to find approximations of this kind.

A different line of approach was adopted by Bartlett (1937). Neyman & Pearson had pointed out in their original paper that, if $N'$ is the total sample size, $-N'\log_e L_1$ would be asymptotically distributed as $\chi^2$. From considerations of sufficiency Bartlett obtained what was in effect a modified form (which, following Hartley & Pearson (1946), we shall call $M$) of this logarithmic statistic. From the moments of the modified likelihood statistic he was able to develop a scale factor $C$, which was related to the effective sizes of samples and which approached the value unity as the sample sizes became large. The distribution of $M/C$ was then very well represented by $\chi^2$ even when the samples were small. Bartlett later (1938) used the same method to obtain an approximation for the test of significance in multivariate analysis. In 1940 Hartley, starting from the moments of the modified likelihood statistic, obtained an asymptotic series of $\chi^2$ integrals for the logarithmic statistic $M$ which agreed very closely with the exact distribution. In 1941 Wald & Brookner, investigating an entirely different problem, the distribution of Wilks's statistic for testing independence of $k$ sets of variates, again starting from the moments of the likelihood statistic $\lambda$, eventually obtained an expression for the distribution of a logarithmic statistic (a negative multiple of $\log_e\lambda$) in the form of an asymptotic $\chi^2$ series. This was later modified by Rao (1948) in the important special case of two groups, when it corresponds to the test of significance in multivariate analysis previously referred to. Neither Wald & Brookner nor Rao investigated the accuracy of these series.

It is possible therefore to distinguish two definite lines of approach, which have been used 
in certain cases where the moments of the likelihood criteria are known but the exact dis- 
tributions are not. On the one hand the moments have been used to fit the Pearson-type curve. 
This usually gives an adequate approximation, but owing to the amount of labour involved 
in the calculation of the moments it would not be attractive for routine significance testing 
unless methods such as Bishop’s could be used to obtain the parameters of the fitted curve 
directly, or the results from the method could be tabled. On the other hand, the general expression for the moments of the likelihood statistic has been used in certain cases to obtain, for the distribution of the logarithmic statistic $M$, a $\chi^2$ approximation and an asymptotic $\chi^2$ series. It will be the object of the present paper to investigate in some detail this second line of attack.

The method will be investigated in particular for two general criteria: 


(1) The test of constancy of variance and covariance of k sets of p-variate samples. This 
includes, as an important special case when p = 1, the test for constancy of variance in k 
samples. 

(2) Wilks's test for the independence of $k$ sets of residuals, the $l$th set having $p_l$ variates. When $k = 2$ this corresponds to the test of significance used in multivariate regression and analysis of variance and covariance, and when $k = 2$ and $p_1$ or $p_2$ is unity, it gives the corresponding well-known univariate tests. In the latter case, of course, the exact distributions are known.

We shall refer to these two criteria as generalized tests for homoscedasticity and independ- 
ence, respectively. The assumption of normality or multinormality for the distributions of 
the original observations will be made throughout this paper. 

Taking for our test function $M$, a negative multiple of the natural logarithm of the likelihood statistic (or some modification of it), we shall obtain in each case,

(a) a series solution which, we shall demonstrate, agrees very closely with the exact distribution,

(b) an approximate solution using a single $\chi^2$ distribution,

(c) a rather better approximation using a single $F$ distribution.


The accuracy of the various methods and the relation of the results to those of other 
workers will be discussed. 


2. THE GENERALIZED TEST OF HOMOSCEDASTICITY 

The univariate statistic. The $L_1$ statistic of Neyman & Pearson for testing the homogeneity of a set of variances takes the form of the ratio of a weighted geometric mean of variances to a weighted arithmetic mean, where the weights are the sample numbers. Welch (1935, 1936) generalized the test to cover the case when residuals from a fitted regression equation were tested for homoscedasticity, and derived the moments for a modified criterion in which the weights could have any values whatever. In 1936 Nayer tabled the approximate significance points for $L_1$ in the cases of equal sample numbers by fitting type I curves to the distributions by the method of moments, as suggested in the original memoir by Neyman & Pearson.

The statistic proposed by Bartlett (1937), which we shall call $M$, is given by

\[ M = N\log_e s - \sum_{l}\nu_l\log_e s_l, \]

where

\[ s = \Big(\sum_{l}\nu_l s_l\Big)\Big/N, \]

and $s_l$ is the usual unbiased estimate of the variance in the $l$th group, $l = 1, 2, \ldots, k$, based on sums of squares having $\nu_l$ degrees of freedom, and $N = \sum\nu_l$. It was later shown (Brown, 1939; Pitman, 1939; Bishop & Nair, 1939) that this criterion is unbiased in the sense used by Neyman & Pearson (1936, 1938). Nair (1939) derived a series solution for the distribution of the likelihood statistic in the case of equal sample numbers; his solution is very involved, but has been used as a standard to check approximations. Bishop & Nair (1939) used this series to check the accuracy of the type I approximation used in Nayer's table. They also checked the Bartlett (1937) approximation and found that both methods were fairly good except when the degrees of freedom were small. In the case of unequal samples, however, Nayer's tables were not available, and in view of the labour involved in the type I fit, the $\chi^2$ method of Bartlett's was preferred.

Hartley's (1940) asymptotic series depended, to the degree of approximation used, on two parameters $c_1$ and $c_3$ which varied with the effective sample size and relative composition of the groups

\[ c_1 = \sum_{l}\frac{1}{\nu_l} - \frac{1}{N}, \qquad c_3 = \sum_{l}\frac{1}{\nu_l^3} - \frac{1}{N^3}. \]

The first is related to Bartlett's scale factor $C$, in fact

\[ C = 1 + \frac{c_1}{3(k-1)}. \]

Tables were afterwards computed by Thompson & Merrington (1946) from Hartley's formula, and comparisons were made with the values calculated by Bishop & Nair.
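For a concrete picture of the univariate quantities, the following sketch computes $M$, $c_1$ and the scale factor $C$ from a set of variance estimates and their degrees of freedom. The function name and the example figures are illustrative only; $M/C$ would then be referred to $\chi^2$ with $k - 1$ degrees of freedom.

```python
import math

def bartlett_m(variances, dof):
    """Return (M, C, M/C) for k variance estimates.

    variances: unbiased estimates s_l; dof: their degrees of freedom nu_l.
    """
    n = sum(dof)                                              # N = sum of nu_l
    pooled = sum(v * s for v, s in zip(dof, variances)) / n   # s
    m = n * math.log(pooled) - sum(v * math.log(s)
                                   for v, s in zip(dof, variances))
    c1 = sum(1.0 / v for v in dof) - 1.0 / n                  # Hartley's c_1
    c = 1.0 + c1 / (3.0 * (len(dof) - 1))                     # Bartlett's C
    return m, c, m / c

# Hypothetical example: two variances, each on 10 degrees of freedom.
m, c, scaled = bartlett_m([1.0, 2.0], [10, 10])
m_equal, _, _ = bartlett_m([3.0, 3.0, 3.0], [5, 7, 9])
```

When all the estimates are equal, $M$ is zero, as the second call illustrates.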

The multivariate statistic. In the multivariate case Wilks (1932) derived the likelihood ratio test and obtained the moments of the criterion, which is a generalized form of that used in the univariate test, the determinants of estimated variances and covariances replacing the variances. Bishop (1939) took as his criterion $l_1$, the $1/N'$th power of the likelihood statistic, $N'$ being the total number of observations. He gave reasons for believing that this criterion could be approximately represented by a type I curve

\[ p(l_1) = \text{constant}\; l_1^{m_1-1}(1-l_1)^{m_2-1} \tag{1} \]

by choosing the values of $m_1$ and $m_2$ so that the first two moments of the Pearson curve agreed with those of the criterion. His arguments were supported by the agreement found in a number of trials between the higher moments of the fitted type I curve and those of the criterion. Only in the case of two groups and either one or two variates was it possible to obtain a check against the exact distribution, but in these cases the agreement was very good. Unfortunately the labour involved in the calculation of the first two moments of the criterion was too great to allow this method to be recommended for routine use. Bishop therefore proceeded as follows:

(a) For the case of equal sample sizes he obtained, empirically, expressions for $m_1$ and $m_2$ in terms of the number of observations $n$ in each group, the number of variates $p$, and the number of groups $k$

\[ \left.\begin{aligned} m_1 &= k(n-p) - 0.01(k-1), \\ m_2 &= 0.25(k-1)\,p(p+1). \end{aligned}\right\} \tag{2} \]

(b) For unequal sample sizes he proposed approximating to $-2N'\log_e l_1$ by means of a $\chi^2$ distribution using a scale factor $G$ in a similar way to that adopted by Bartlett in the univariate case.

He showed that $-2N'\log_e l_1$ is approximately distributed as $G\chi^2$, where


| eae : , ’ 
G=1+ 7 > = {0?/2n, + i/(m,— 1%) + m/3(m,— 2)*} 
=] i= 


— 5 (+i yan’ + (e+ i— lw’ —k+ 1) + NB" -k+ 1-9]. (3) 
i=1 


$n_l$ is the number of observations in the $l$th group, $\sum n_l = N'$ and $\chi^2$ is distributed with $f = \tfrac{1}{2}(k-1)\,p(p+1)$ degrees of freedom.

We shall refer to these methods as Bishop's methods (a) and (b). Bishop remarks that the scale factor $G$ is rather troublesome to calculate unless $n_l = n$. George (1945) was able to evaluate the exact distribution in a number of simple cases. She used her results to test the accuracy of Bishop's approximations and found, in the cases she considered, that (b) was superior to (a).

Plackett (1947) suggested that, in view of the unsatisfactory position with regard to the distribution of this criterion, it might be better to abandon it in favour of an alternative test derived by him which had the advantage that at least when $p = 1$ or 2, and for certain other special cases, the exact distribution was known. Plackett's test, however, has the disadvantages that the results depend on the particular arrangement of observations chosen, and that the samples must be of equal sizes.


2.1. The present approach

Suppose $s_{ijl}$ is the usual unbiased estimate of the variance or covariance $\lambda_l^{ij}$ between the $i$th and $j$th variable in the $l$th sample based on sums of squares and products having $\nu_l$ degrees of freedom, and suppose there are $k$ such samples and $s_{ij}$ is the average variance or covariance $\big(\sum_{l}\nu_l s_{ijl}\big)/N$, where $N = \sum_l \nu_l$. We take as our criterion a generalized form of Bartlett's criterion

\[ M = N\log_e |s_{ij}| - \sum_{l}\big(\nu_l\log_e |s_{ijl}|\big) \tag{4} \]

\[ \phantom{M} = -N\log_e L_1, \quad\text{where}\quad L_1 = \prod_{l=1}^{k}\left(\frac{|s_{ijl}|}{|s_{ij}|}\right)^{\nu_l/N}. \tag{5} \]

When the degrees of freedom are equal, $M$ and $L_1$ are related with Bishop's $l_1$ as follows:

\[ M = -2N\log_e l_1 \quad\text{and}\quad L_1 = l_1^2. \tag{6} \]

When the degrees of freedom are unequal, $L_1$ will differ from the likelihood statistic in weighting. When $p = 1$, $M$ is the criterion derived by Bartlett and later used by Hartley.
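The generalized criterion can be computed directly from the $k$ estimated covariance matrices. The sketch below is written in plain Python, with a small determinant routine so that no external library is needed; the function names are illustrative and the matrices are assumed to be positive definite.

```python
import math

def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n = len(a)
    d = 1.0
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(a[r][i]))
        if a[pivot][i] == 0.0:
            return 0.0
        if pivot != i:
            a[i], a[pivot] = a[pivot], a[i]
            d = -d                      # a row swap changes the sign
        d *= a[i][i]
        for r in range(i + 1, n):
            factor = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= factor * a[i][c]
    return d

def generalized_m(cov_matrices, dof):
    """Return (M, L1) of equations (4) and (5).

    cov_matrices: list of k p-by-p matrices (s_ijl); dof: list of nu_l.
    """
    n_total = sum(dof)
    p = len(cov_matrices[0])
    pooled = [[sum(v * s[i][j] for v, s in zip(dof, cov_matrices)) / n_total
               for j in range(p)] for i in range(p)]
    m = n_total * math.log(det(pooled)) - sum(
        v * math.log(det(s)) for v, s in zip(dof, cov_matrices))
    return m, math.exp(-m / n_total)

# Hypothetical examples.
m_uni, _ = generalized_m([[[1.0]], [[2.0]]], [10, 10])            # p = 1
m_bi, l1_bi = generalized_m([[[1.0, 0.0], [0.0, 1.0]],
                             [[2.0, 0.0], [0.0, 2.0]]], [10, 10])  # p = 2
```

With $p = 1$ the function reduces to the univariate $M$, which gives a simple check on the code.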













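We proceed to obtain the moments of $L_1$ when the null hypothesis is true. If $c_{ijl}$ are the sums of squares and products based on $\nu_l$ degrees of freedom corresponding to the $s_{ijl}$, we have $c_{ijl} = \nu_l s_{ijl}$, $c_{ij} = N s_{ij}$, so that $c_{ij} = \sum_l c_{ijl}$. The joint probability density of the $c_{ijl}$ for the $l$th sample is given by the distribution discovered by Wishart (1928):

\[ p(c_{11l}, c_{12l}, \ldots, c_{ppl}) = K(\nu_l)\,|c_{ijl}|^{\frac{1}{2}(\nu_l-p-1)} \exp\Big\{-\tfrac{1}{2}\sum_{ij}\lambda_{ijl}c_{ijl}\Big\}, \tag{7} \]

where

\[ \{K(\nu_l)\}^{-1} = 2^{\frac{1}{2}\nu_l p}\,\pi^{\frac{1}{4}p(p-1)} \prod_{j=0}^{p-1}\Gamma\{\tfrac{1}{2}(\nu_l-j)\}\;|\Lambda_l|^{-\frac{1}{2}\nu_l}, \tag{8} \]

and $\Lambda_l$ is the matrix of the $\lambda_{ijl}$, the inverse of the matrix of the $\lambda_l^{ij}$.

When the null hypothesis is true, $\Lambda_l$ is the same for each of the samples, $\Lambda_l = \Lambda$, $l = 1, 2, \ldots, k$, and the $q$th moment of $|c_{ij}|$ is

\[ \int\!\cdots\!\int \prod_{l=1}^{k}\Big[K(\nu_l)\,|c_{ijl}|^{\frac{1}{2}(\nu_l-p-1)}\Big]\,|c_{ij}|^{q} \exp\Big(-\tfrac{1}{2}\sum_{l}\sum_{ij}\lambda_{ij}c_{ijl}\Big) \prod_{l=1}^{k} dc_{11l}\,dc_{12l}\cdots dc_{ppl}; \tag{9} \]

it is also given by

\[ \int\!\cdots\!\int K(N)\,|c_{ij}|^{\frac{1}{2}(N-p-1)}\,|c_{ij}|^{q} \exp\Big(-\tfrac{1}{2}\sum_{ij}\lambda_{ij}c_{ij}\Big)\, dc_{11}\,dc_{12}\cdots dc_{pp}. \tag{10} \]

Writing $\nu_l(1+2h)$ for $\nu_l$ on both sides of the identity and then taking $q = -Nh$ and integrating over the whole space for which the matrices of the $c_{ijl}$, $c_{ij}$ are positive definite, we have

\[ E\left\{\frac{\prod_{l=1}^{k}|c_{ijl}|^{\nu_l h}}{|c_{ij}|^{Nh}}\right\} = \frac{K\{N(1+2h)\}}{K(N)} \prod_{l=1}^{k}\frac{K(\nu_l)}{K\{\nu_l(1+2h)\}}. \tag{11} \]

That is

\[ E(L_1^{Nh}) = E(e^{-hM}) = \frac{K\{N(1+2h)\}}{K(N)} \prod_{l=1}^{k}\frac{K(\nu_l)}{K\{\nu_l(1+2h)\}}\left(\frac{N}{\nu_l}\right)^{p\nu_l h} \tag{12} \]

\[ \phantom{E(L_1^{Nh})} = \prod_{l=1}^{k}\left(\frac{N}{\nu_l}\right)^{p\nu_l h} \prod_{j=0}^{p-1}\left[\frac{\Gamma\{\tfrac{1}{2}(N-j)\}}{\Gamma\{\tfrac{1}{2}(N(1+2h)-j)\}} \prod_{l=1}^{k}\frac{\Gamma\{\tfrac{1}{2}(\nu_l(1+2h)-j)\}}{\Gamma\{\tfrac{1}{2}(\nu_l-j)\}}\right]. \tag{13} \]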


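We have first proved equation (13) as an analytic identity for real $h$; it will, however, be generally valid in the range where the functions are analytic. We can thus obtain an expression for the characteristic function of $\rho M$, where $\rho$ is a constant $< 1$ at our choice, by replacing $h$ by $-it\rho$ in the above expression. The reason for introducing the constant $\rho$ will appear later. Further, if we write $N = \nu k$ (i.e. $\nu$ is the average of the degrees of freedom) and define new quantities $\mu$, $\mu_l$, $\beta$, $\beta_l$ by the relations

\[ \mu = \rho\nu, \quad \mu_l = \rho\nu_l, \quad \nu = \mu + \beta, \quad \nu_l = \mu_l + \beta_l, \tag{14} \]

we obtain the characteristic function of $\rho M$ in the form

\[ \phi(t) = \prod_{l=1}^{k}\left(\frac{k\mu}{\mu_l}\right)^{-itp\mu_l} \prod_{j=0}^{p-1}\left[\frac{\Gamma(\tfrac{1}{2}\{k\mu + k\beta - j\})}{\Gamma(\tfrac{1}{2}\{k\mu(1-2it) + k\beta - j\})} \prod_{l=1}^{k}\frac{\Gamma(\tfrac{1}{2}\{\mu_l(1-2it) + \beta_l - j\})}{\Gamma(\tfrac{1}{2}\{\mu_l + \beta_l - j\})}\right], \tag{15} \]

and taking logarithms we have the cumulant generating function in the form

\[ \Psi(t) = g(t) - g(0), \tag{16} \]

where

\[ g(t) = -\sum_{l=1}^{k} it\mu_l p\log\left(\frac{k\mu}{\mu_l}\right) + \sum_{j=0}^{p-1}\left[\sum_{l=1}^{k}\log\Gamma(\tfrac{1}{2}\{\mu_l(1-2it) + \beta_l - j\}) - \log\Gamma(\tfrac{1}{2}\{k\mu(1-2it) + k\beta - j\})\right], \tag{17} \]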








and $g(0)$ is a constant independent of $t$ obtained by putting $t = 0$ in the above expression. Now Barnes (1899) was able to generalize Stirling's theorem, and he showed that for all $x$, real or complex, $\log\Gamma(x+h)$ may be expanded in an asymptotic series:

\[ \log\Gamma(x+h) = \log\sqrt{(2\pi)} + (x+h-\tfrac{1}{2})\log x - x - \sum_{r=1}^{n}\frac{(-1)^r B_{r+1}(h)}{r(r+1)x^r} + R_{n+1}(x), \tag{18} \]

where $R_{n+1}(x)$ is a remainder term such that $|R_{n+1}(x)| \le \theta/|x^{n+1}|$, $\theta$ is some constant independent of $x$ and $B_r(h)$ is the Bernoulli polynomial of degree $r$ and order unity defined by

\[ \frac{\tau e^{h\tau}}{e^{\tau}-1} = \sum_{r=0}^{\infty}\frac{\tau^r}{r!}B_r(h). \tag{19} \]
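A quick numerical check of an expansion of the form (18) can be made against the library value of $\log\Gamma$. In the sketch below the Bernoulli numbers are generated from their usual recurrence (with $B_1 = -\tfrac{1}{2}$, in agreement with (19)); the function names and the trial values $x = 10$, $h = 0.3$ are assumptions chosen only for illustration.

```python
import math
from fractions import Fraction

def bernoulli_numbers(n):
    """B_0, ..., B_n from the recurrence sum_{k<=m} C(m+1,k) B_k = 0."""
    b = [Fraction(1)]
    for m in range(1, n + 1):
        b.append(-sum(math.comb(m + 1, k) * b[k] for k in range(m)) / (m + 1))
    return b

def bernoulli_poly(r, h):
    """Bernoulli polynomial B_r(h) = sum_k C(r,k) B_k h^(r-k)."""
    b = bernoulli_numbers(r)
    return sum(math.comb(r, k) * float(b[k]) * h ** (r - k)
               for k in range(r + 1))

def log_gamma_series(x, h, n):
    """First n correction terms of the asymptotic series for log Gamma(x+h)."""
    total = 0.5 * math.log(2.0 * math.pi) + (x + h - 0.5) * math.log(x) - x
    for r in range(1, n + 1):
        total -= (-1) ** r * bernoulli_poly(r + 1, h) / (r * (r + 1) * x ** r)
    return total

approx = log_gamma_series(10.0, 0.3, 4)
exact = math.lgamma(10.3)
```

Even with four terms the two values agree to many decimal places at $x = 10$; the agreement deteriorates as $x$ becomes small, as is usual for an asymptotic series.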
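Expanding each of the $\Gamma$-functions in this manner we obtain

\[ \Psi(t) = Q - g(0) - \tfrac{1}{4}(k-1)\,p(p+1)\log(1-2it) + \sum_{r=1}^{n}\alpha_r\mu^{-r}(1-2it)^{-r} + R_{n+1}(\mu, t), \tag{20} \]

where $Q$ does not contain $t$ and is given by

\[ Q = \tfrac{1}{2}p(k-1)\log 2\pi - \tfrac{1}{2}p\sum_{l=1}^{k}\mu_l\log\left(\frac{k\mu}{\mu_l}\right) + \tfrac{1}{2}p\left[\sum_{l=1}^{k}\Big(\beta_l - \frac{p+1}{2}\Big)\log\frac{\mu_l}{2} - \Big(k\beta - \frac{p+1}{2}\Big)\log\frac{k\mu}{2}\right], \tag{21} \]

\[ \alpha_r = \frac{(-1)^{r+1}2^r}{r(r+1)}\sum_{j=0}^{p-1}\left[\sum_{l=1}^{k}\left(\frac{\mu}{\mu_l}\right)^{r}B_{r+1}\Big(\frac{\beta_l-j}{2}\Big) - \frac{1}{k^r}B_{r+1}\Big(\frac{k\beta-j}{2}\Big)\right], \tag{22} \]

and $R_{n+1}(\mu, t)$ is defined by (17) and (20).

From (20) we have

\[ \phi(t) = K(1-2it)^{-\frac{1}{2}f}\sum_{v=0}^{\infty}a_v\mu^{-v}(1-2it)^{-v} + R'_{n+1}(\mu, t), \tag{23} \]

where $K = \exp\{Q - g(0)\}$, $f = \tfrac{1}{2}(k-1)\,p(p+1)$, and $a_v$ is the coefficient of $\mu^{-v}$ in the expansion of $\exp\big[\sum_{r=1}^{n}\alpha_r\mu^{-r}(1-2it)^{-r}\big]$.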


The probability density function of $\rho M$ is then given by

\[ p(\rho M) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\phi(t)\,e^{-it\rho M}\,dt \tag{24} \]

\[ \phantom{p(\rho M)} = K\sum_{v=0}^{\infty}a_v\mu^{-v}\,p(\chi^2_{f+2v}) + R''_{n+1}(\mu, t). \tag{25} \]

The probability that a given value $M_0$ of the criterion is exceeded is therefore

\[ \Pr\{M > M_0\} = K\sum_{v=0}^{\infty}a_v\mu^{-v}P_{f+2v} + R'''_{n+1}(\mu, t), \tag{26} \]

where

\[ P_{f+2v} = \int_{\rho M_0}^{\infty}p(\chi^2_{f+2v})\,d\chi^2, \tag{27} \]

\[ R'''_{n+1}(\mu, t) = \frac{K}{2\pi}\int_{\rho M_0}^{\infty}\int_{-\infty}^{\infty}e^{-it\rho M}\sum_{v=0}^{\infty}a_v\mu^{-v}(1-2it)^{-\frac{1}{2}(f+2v)}\big(e^{R_{n+1}(\mu,t)} - 1\big)\,dt\,d(\rho M), \]

and $p(\chi^2_{f+2v})$ is the probability density function of the $\chi^2$ distribution with $f+2v$ degrees of freedom. For all sufficiently large values of $\mu$, $R'''_{n+1}(\mu, t)$ tends to zero and the required probability will be given with sufficient accuracy by taking a suitable number of terms of the series in (26). Putting $t = 0$ in (20) we have

\[ Q - g(0) = -\Big\{\sum_{r=1}^{n}\alpha_r/\mu^r + R_{n+1}(\mu, 0)\Big\}. \tag{28} \]

It is found in practice that by taking a few terms of the series (even in difficult cases usually not more than six), $\exp(-\sum\alpha_r/\mu^r)$ is so close to $\exp\{Q - g(0)\} = K$ that direct calculation of that constant is unnecessary.
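A series of the form (26) is straightforward to evaluate once the $\alpha$'s are known. The sketch below assumes that $f$ is a whole number, takes $K = \exp(-\sum\alpha_r/\mu^r)$ as suggested in the text, obtains the $a_v$ from the $\alpha_r$ by the usual recurrence for the exponential of a power series, and uses closed forms for the $\chi^2$ integrals with integral degrees of freedom. The function names and the trial values are illustrative only.

```python
import math

def chi2_sf(x, df):
    """P{chi-square with integer df exceeds x}, by closed forms."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total

def exp_series_coeffs(alphas, n_terms):
    """a_v: coefficient of z**v in exp(sum_r alphas[r-1] * z**r)."""
    a = [1.0]
    for v in range(1, n_terms):
        s = sum(r * alphas[r - 1] * a[v - r]
                for r in range(1, min(v, len(alphas)) + 1))
        a.append(s / v)
    return a

def tail_probability(m0, rho, mu, f, alphas, n_terms=8):
    """Approximate Pr{M > m0} from the leading terms of a series like (26)."""
    k_const = math.exp(-sum(al / mu ** r
                            for r, al in enumerate(alphas, start=1)))
    a = exp_series_coeffs(alphas, n_terms)
    x = rho * m0
    return k_const * sum(a[v] / mu ** v * chi2_sf(x, f + 2 * v)
                         for v in range(n_terms))

# With no correction terms the series reduces to a single chi-square integral.
p_plain = tail_probability(5.991464547107979, 1.0, 10.0, 2, [])
# At m0 = 0 the series should sum to unity (hypothetical alpha_1 = 0.1).
p_zero = tail_probability(0.0, 1.0, 10.0, 2, [0.1])
```

The check at $M_0 = 0$ confirms that $K\sum a_v\mu^{-v}$ is unity, which is the property used in the text to avoid calculating $K$ directly.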

If we expand $Q - g(0)$ as well as $g(t)$ we obtain instead of (20)

\[ \Psi(t) = -\tfrac{1}{2}f\log(1-2it) + \sum_{r=1}^{n}\alpha_r\mu^{-r}\{(1-2it)^{-r} - 1\} + R_{n+1}(\mu, t) - R_{n+1}(\mu, 0). \tag{29} \]

Proceeding as before we obtain

\[ \Pr\{M > M_0\} = P_f + \alpha_1(P_{f+2} - P_f)\,1/\mu + \Big[\alpha_2(P_{f+4} - P_f) + \frac{\alpha_1^2}{2!}(P_{f+4} - 2P_{f+2} + P_f)\Big]1/\mu^2 + \text{etc.} \tag{30} \]

Thus we may use a suitable number of terms of either of the series given in (26) or (30) to obtain the probability of the criterion exceeding a given value. Formula (26) has been used in this paper, (30) being rather unwieldy if a large number of terms have to be taken.

It should be noted that in the derivation we have used two series, first the asymptotic series for the expansion of the $\Gamma$-functions, and then the exponential series. In any particular case we have to decide how many terms we need in the asymptotic series to give a sufficiently close representation of the function and then how many terms we shall use in the exponential series. In those cases investigated here, six terms of the asymptotic series have nearly always proved adequate as judged by the closeness of agreement between $\sum\alpha_r/\mu^r$ and $g(0) - Q$ independently calculated; often fewer terms were necessary, terms in higher powers of $1/\mu$ having negligible effect. In the case of the exponential series the number of terms necessary to represent adequately $\exp\sum_{r=1}^{n}\alpha_r/\mu^r$ is usually not more than eight, but has sometimes been as many as fourteen. It is mainly in order to keep the number of terms required at this stage within manageable limits that the scale factor $\rho$ is introduced, since by suitable choice of this constant, the values of the $\alpha$'s can be kept small and the number of terms required in the exponential series is consequently less. We see that in effect we are fitting a $\chi^2$ series to the statistic $M$ by arranging that, to the order of accuracy chosen in the asymptotic series, the series will have all its cumulants identical with those of $M$. Before we consider the problem of choosing a suitable value for $\rho$, we shall derive an expression for the $\alpha$'s and hence the $a$'s in a form which is more suitable for computation.


2-2. Determination of the a’s in a form suitable for calculation 


From the well-known properties of Bernoulli polynomials (see, for example, Milne- 
Thomson, 1933), we may write the symbolic equality 


$$B_r(x + y) \doteq (B + x + y)^r, \tag{31}$$


where, after expansion, each index of B is to be replaced by the corresponding suffix. Whence 


B(“52) = 3 (’ : ') (4) Bf A. a 










Also if $P(x)$ is a polynomial in $x$ and $P'(x)$ is the differential coefficient of $P(x)$ with respect
to $x$,








p-l 
= Pia) + P(B+p)— P(B). (33) 
Thus if P'(j) = B(=}). 
P(j) = _2 p be nstant 34 
ae § e+1| 9) + constant, (34) 
pet 2 (-j —2 B+p) B 
. J\. —* _PtP\_ _ >| . 
and >, BI 2 jes 1 | Boal 2 Bas} 2] "i (35) 
If we denote the expression in square brackets by $\delta_s$, we obtain from (32) and (35)
p-l k Bi-Jj k r+1 r+l1 l B, r+l-—s 
S$ (deo 8 8 (2 ys, e 
joo 1m ral 2 1=1 4 s Js+1\2 , _ 
and in a similar way we find 
p-1 kp -j r+1 p+ | ‘p r+l--s 
»> —|=-2> = . 37 
2B . ZI s }s+l (5) °. (67) 


Whence from (22) we obtain $\alpha_r$ as a polynomial of degree $r$ in $\beta$





—_1\"k r+1 9 
where D, = 8,Y-; 
8% Byy|-=5? | - B,-5I, (39) 
and Ys = iz (*) — (40) 
It is interesting to note the relation between these quantities and the $c$'s defined by Hartley; if
2. a 
c, = a Ne Ye = kv*I¢,_}. (41) 
In the special case when the samples have equal degrees of freedom 
Y, = 1-1/k. 
The values for $\delta_s$ for $s = 0, 1, \ldots, 7$, found from equation (39), are given below:
8 6, 
0 — 2P, 
1 ip(p + 1), 
- — {eP(2p* + 3p — 1), 
3 reP(P—1)(p+1)(p+2), (42) 
4 — zbzp(Gp* + 15p* — 10p? — 30p + 3), 
5 reeP(p-1)(pt1)(p+2) (2p? + 2p—7), 
6 — eaP(Gp* + 21p5— 21 pt — 105p3 + 21p? + 147p— 5) 
7 76s?(P— 1)(p+1) (pt 2) (3p* + 6p — 23p? — 26p + 62), 


































and the values for the first six $\alpha$'s from equation (38) are
3) a, = — 3k{3D, 2 + 2D3}, 
t= I1tki{3D,f?+4D,2+2D;}, 
a = — jk{5D, 3+ 10D, A? + 10D,2 + 4D,}, par 
t= gok(15D,/*+ 40D, 63 + 60D, A? + 48D, 8 + 16D}, 
4) Xs = —osh{21D, f° + 70D, 4 + 140D, 6? + 168D, 6? + 112D;8 + 32D,}, 
te = gyh{7D,P%+ 28D, 5 + 70D, f* + 112D, 3 + 112D, 82+ 64D, 8 + 16D.. 
Fig. 1










2.3. Choice of the value of $\rho$


In order that the series should be of practical utility, it must be possible to represent
$\exp\{\sum \alpha_r \mu^{-r}(1 - 2it)^{-r}\}$ adequately by a reasonable number of terms of the exponential series;
this can be done only if the coefficients $\alpha$ are fairly small. In the univariate case, these
coefficients will be small even if $\rho = 1$, and in fact if we put $p = 1$ and $\rho = 1$ in equation (26)
the series we obtain corresponds exactly with that found by Hartley (1940) using rather a
different method of approach. The accuracy of Hartley's series, using only three terms in the
asymptotic series, was demonstrated (Hartley & Pearson, 1946) by comparison with the
significance levels obtained from Nair's exact expansion; the agreement obtained was good
even when the degrees of freedom were as low as three. In the multivariate case, a much more
satisfactory series can be obtained if $\rho$ is less than unity.

A typical set of curves showing the values of $\alpha_1/\mu$, $\alpha_2/\mu^2$, ..., $\alpha_6/\mu^6$, and the closeness of
agreement between $Q - g(0)$ and $-\sum_{r=1}^{6} \alpha_r/\mu^r$ for varying values of $\rho$, are plotted in the figure
for the case $p = 5$, $k = 5$, $\nu = 9$. The curves all have minima or cross the zero line between
$\rho = 0.7$ and $\rho = 0.8$. The value of $\rho$ which makes $\alpha_1$ zero is $\rho = 0.76296$.

In the calculations carried out here, $\rho$ was chosen so that $\alpha_1 = 0$, since this not only resulted
in the other coefficients being small, but the absence of $\alpha_1$ made the calculation of the $a$'s
much easier. Putting $\alpha_1 = 0$ we obtain

$$\rho = 1 - \frac{2p^2 + 3p - 1}{6(k-1)(p+1)}\left(\sum_{l=1}^{k}\frac{1}{\nu_l} - \frac{1}{N}\right). \tag{44}$$
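As a numerical illustration of (44), the following minimal sketch evaluates the scale factor for a given number of variates and set of degrees of freedom. The function name `scale_factor_rho` and its argument names are hypothetical, and $N$ is assumed to be the sum of the $\nu_l$.

```python
def scale_factor_rho(p, nus):
    """Scale factor rho of equation (44), i.e. the value making alpha_1 zero.

    p   : number of variates
    nus : degrees of freedom nu_l of the k samples
          (assumption: N is the sum of the nu_l)
    """
    k = len(nus)
    n_total = sum(nus)
    spread = sum(1.0 / nu for nu in nus) - 1.0 / n_total
    return 1.0 - (2 * p**2 + 3 * p - 1) / (6.0 * (k - 1) * (p + 1)) * spread


# Case plotted in the figure: p = 5, k = 5, nu = 9.
rho_fig = scale_factor_rho(5, [9] * 5)
# Case p = 4, k = 5, nu = 9.
rho_ex = scale_factor_rho(4, [9] * 5)
print(round(rho_fig, 5), round(rho_ex, 6))
```

For the case of the figure this gives 0.76296, the value quoted in the text as making $\alpha_1$ zero.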

2.4. Example of a calculation using the series

To check his two working approximations (a) and (b), Bishop used as a standard of reference
the values obtained by exact fitting of type I curves to the first two moments of the criterion
$l_1$. In the case $p = 4$, $k = 5$, $\nu = 9$ Bishop found for the 5% point a value corresponding to
$M = 70.281$.

To obtain from the series the probability associated with this value, we calculate

$f = 40$, $\rho = 0.808889$, $\rho M = 56.8495$, $\mu = \rho\nu = 7.28$.


 r     α_r/μ^r     a_r/μ^r     P_{f+2r}

 0        —        1.000000    0.040742
 1     0.000000    0.000000       —
 2     0.143702    0.143702    0.092597
 3     0.003675    0.003675    0.131138
 4     0.001793    0.012118    0.178763
 5     0.000094    0.000622    0.235161
 6     0.000032    0.000791    0.304909
 7        —        0.000059    0.369563
 8        —        0.000044    0.446178

$\sum_{r=1}^{6} \alpha_r/\mu^r = 0.149296$,   $\sum_{r=0}^{8} a_r/\mu^r = 1.161011$,

$g(0) - Q = 0.149305$,   $\exp\{\sum_{r=1}^{6} \alpha_r/\mu^r\} = 1.161016$,

Difference 0.000009,   Difference 0.000005.

$K = \exp\{Q - g(0)\} = 0.861306$,   $\exp\{-\sum_{r=1}^{6} \alpha_r/\mu^r\} = 0.861314$.

$\Pr\{M > 70.281\} = K \sum \{a_r/\mu^r\} P_{f+2r} = 0.0492$.







To illustrate the accuracy with which the asymptotic series represents the function, independent
calculations of $g(0) - Q$ have been made. As has already been indicated, however, in
practice this rather laborious calculation would not be necessary, $K$ being taken as
$\exp\{-\sum \alpha_r/\mu^r\}$.
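The arithmetic of this example can be retraced with a short script. The sketch below takes the tabulated values of $\alpha_r/\mu^r$ and $P_{f+2r}$ as given (they are copied from the table above, not recomputed), forms the coefficients $a_r/\mu^r$ by the standard recurrence for the exponential of a power series, and accumulates the probability with $K$ taken as $\exp\{-\sum \alpha_r/\mu^r\}$. The function and variable names are hypothetical.

```python
from math import exp


def exp_series_coeffs(alpha, n_terms):
    """Coefficients a_0..a_n of exp(sum_r alpha[r] * x**r); alpha[0] is ignored.

    Uses the standard recurrence  n * a_n = sum_{k=1..n} k * alpha_k * a_{n-k}.
    """
    a = [1.0]
    for n in range(1, n_terms + 1):
        total = 0.0
        for k in range(1, min(n, len(alpha) - 1) + 1):
            total += k * alpha[k] * a[n - k]
        a.append(total / n)
    return a


# alpha_r / mu^r for r = 0..6 (index 0 is a placeholder), copied from the table.
alpha = [0.0, 0.0, 0.143702, 0.003675, 0.001793, 0.000094, 0.000032]
# P_{f+2r} for r = 0..8; the r = 1 entry is not needed because a_1 = 0.
P = [0.040742, 0.0, 0.092597, 0.131138, 0.178763,
     0.235161, 0.304909, 0.369563, 0.446178]

a = exp_series_coeffs(alpha, 8)
K = exp(-sum(alpha))                      # K taken as exp{-sum alpha_r/mu^r}
prob = K * sum(a_r * p_r for a_r, p_r in zip(a, P))
print([round(x, 6) for x in a], round(prob, 4))
```

With these inputs the recurrence reproduces the tabulated $a_r/\mu^r$ to within rounding of the $\alpha$'s, and the probability rounds to the 0.0492 quoted above.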




2.5. Some comparisons between the series and the exact distribution

For the cases $p = 1$, $k = 2$ and $p = 2$, $k = 2$, the exact distribution is known for all values
of $\nu$; for $p = 1$ the criterion will simply be a function of the variance ratio, and when $p = 2$
the exact distribution has been found by Pearson & Wilks (1933). Table 1 enables the
probabilities obtained from these exact distributions to be compared with those found using the
series with scale factor $\rho$ and up to four terms in the asymptotic and exponential series,
higher terms having negligible effect. The table shows the values of $M$ corresponding to the
5% and 1% points obtained by Bishop by fitting a type I curve to the first two moments of $l_1$.
The exact probabilities corresponding to these points and those obtained using the series
are shown below the values of $M$.























Table 1

                                          ν = 9      ν = 27     ν = 79

p = 1, k = 2   5% point (type I)          4.0499     3.9042     3.8794
               Probability: exact         0.05005    0.05009    0.05002
                            series        0.05005    0.05009    0.05003

               1% point (type I)          6.9902     6.7461     6.6991
               Probability: exact         0.00998    0.01001    0.01002
                            series        0.00998    0.01001    0.01002

p = 2, k = 2   5% point (type I)          8.8801     8.1191     8.0018
               Probability: exact         0.05005    0.04997    0.04979
                            series        0.05005    0.04997    0.04979

               1% point (type I)         12.8969    11.7844    11.6074
               Probability: exact         0.00999    0.01000    0.00997
                            series        0.00999    0.01000    0.00997





The agreement between the series and the exact values is remarkably good, the series
giving five-decimal accuracy in almost every case tested. The more difficult cases, however,
are those where $p$ and $k$ are larger, especially when $\nu$ is small. For these, the closeness with
which $\sum \alpha_r/\mu^r$ approaches $g(0) - Q$ and the adequacy, when $\rho$ is suitably chosen, of the
exponential series as judged by the comparison of $\exp\{\sum(\alpha_r/\mu^r)\}$ and $\sum(a_r/\mu^r)$, support
belief in the accuracy of this solution. For example, the case $p = 4$, $k = 5$ which we have used
to illustrate the calculation of probabilities from the series, is not a particularly favourable
one. It appears, however, that six terms of the asymptotic series and eight of the exponential
series will be adequate; in less severe cases of course fewer terms are necessary. Further
evidence is supplied later for the accuracy of this type of solution, for in tests of independence
to be discussed in § 6, exact distributions are available for comparison, in cases where the
series is not favoured, and excellent agreement is found.










APPROXIMATIONS 


The series we have found is of rather too complicated a character for routine use; as an 
alternative, approximations were sought which were relatively simple. 


3. APPROXIMATIONS USING A SINGLE $\chi^2$ DISTRIBUTION


We have for the cumulant generating function of $M$ (putting $\rho = 1$ in equation (20))

$$\Psi(t) = Q - g(0) - \tfrac{1}{2}f\log(1 - 2it) + \sum_{r=1}\frac{\alpha'_r}{\nu^r}(1 - 2it)^{-r}, \tag{45}$$

where $f = \tfrac{1}{2}p(p+1)(k-1)$ and $\alpha'_r$ is obtained by putting $\rho = 1$ (i.e. $\beta = 0$) in equation (43).
Expanding this expression in powers of $t$ we obtain





5 ates u| (ir i 
Yo= ¥ Sarag— ys ECT) Se. (46) 
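The $j$th cumulant of $M$ is then given by

$$\kappa_j = 2^{j-1}(j-1)!\,f\left\{1 + jA_1 + \frac{j(j+1)}{2!}A_2 + \dots\right\}, \tag{47}$$

where

$$A_r = \frac{2r\alpha'_r}{\nu^r f}, \tag{48}$$

and in particular for the generalized test for homoscedasticity which we are considering,

$$A_1 = \frac{2p^2 + 3p - 1}{6(k-1)(p+1)}\left(\sum_l \frac{1}{\nu_l} - \frac{1}{N}\right), \qquad A_2 = \frac{(p-1)(p+2)}{6(k-1)}\left(\sum_l \frac{1}{\nu_l^2} - \frac{1}{N^2}\right). \tag{49}$$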


3.1. The choice of a scale factor in the $\chi^2$ approximation


Now $2^{j-1}(j-1)!f$ is the $j$th cumulant of $\chi^2$ with $f$ degrees of freedom. Thus, to order
$\nu^{-1}$, (47) is identical with the $j$th cumulant of $C\chi^2$, where $C$ is either $1 + A_1$ or $(1 - A_1)^{-1}$. If
$A_2$ were zero then $C = 1 + A_1$ would give the first cumulant $\kappa_1$ to order $\nu^{-2}$ and the remaining
cumulants would clearly be less in error than if $C$ were taken as $(1 - A_1)^{-1}$. However, if
$A_2 = A_1^2$, it would be preferable to put $C = (1 - A_1)^{-1}$, since here this form would give
agreement to order $\nu^{-2}$.
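The comparison of the two candidate scale factors can be made explicit by expanding the cumulants of $C\chi^2$ and setting them beside the cumulants of $M$, terms of order $\nu^{-3}$ being dropped throughout. This is a worked check under the cumulant expansion $\kappa_j = 2^{j-1}(j-1)!\,f\{1 + jA_1 + \tfrac{1}{2}j(j+1)A_2 + \dots\}$, not an additional result:

```latex
\begin{align*}
\kappa_j(M) &= 2^{j-1}(j-1)!\,f\,\bigl\{1 + jA_1 + \tfrac{1}{2}j(j+1)A_2 + \dots\bigr\},\\
\kappa_j\{(1+A_1)\chi^2\} &= 2^{j-1}(j-1)!\,f\,(1+A_1)^{j}
   = 2^{j-1}(j-1)!\,f\,\bigl\{1 + jA_1 + \tfrac{1}{2}j(j-1)A_1^2 + \dots\bigr\},\\
\kappa_j\{(1-A_1)^{-1}\chi^2\} &= 2^{j-1}(j-1)!\,f\,(1-A_1)^{-j}
   = 2^{j-1}(j-1)!\,f\,\bigl\{1 + jA_1 + \tfrac{1}{2}j(j+1)A_1^2 + \dots\bigr\}.
\end{align*}
```

When $A_2 = 0$ the first form is exact to this order for $j = 1$ and is in error by $\tfrac{1}{2}j(j-1)A_1^2$ for higher $j$, against $\tfrac{1}{2}j(j+1)A_1^2$ for the second form; when $A_2 = A_1^2$ the second form agrees for every $j$.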

Clearly this would also be the better form to use if $A_2$ were near to or greater than $A_1^2$.
In the univariate case $A_2 = 0$ and $C$ should therefore be taken as

$$C = 1 + A_1 = 1 + \frac{1}{3(k-1)}\left(\sum_l \frac{1}{\nu_l} - \frac{1}{N}\right),$$

as has been shown by Bartlett (1937).
For the generalized test for homoscedasticity we find 


k \? 2 k 
A, 44 = (= 7 Sear Aa tol. 8 (p+2)(% 7-8 fe 3p-1)}, te 







where $\gamma_s$ is defined by equation (40). For $p = 1$, $A_2 = 0$, and consequently this quantity is
negative for all values of $k$. When $p > 1$ it is positive, except in the particular case when
$p = 2$ and $k = 2$ and the $\nu$'s are equal, when the quantity in curled brackets is equal to $-1$,
and $A_1^2$ is almost exactly equal to $A_2$; if the $\nu$'s are not equal, this quantity is greater than
$-1$, and it is positive for all larger values of $p$ and $k$.

For the multivariate statistic, $p > 1$; we therefore take $M/C$ to be approximately
distributed as $\chi^2$ with $f = \tfrac{1}{2}(k-1)p(p+1)$ degrees of freedom and

$$\frac{1}{C} = (1 - A_1) = 1 - \frac{2p^2 + 3p - 1}{6(k-1)(p+1)}\left(\sum_l \frac{1}{\nu_l} - \frac{1}{N}\right);$$

if the degrees of freedom are equal this becomes

$$\frac{1}{C} = 1 - \frac{(2p^2 + 3p - 1)(k+1)}{6(p+1)k\nu}. \tag{51}$$





We note that $1/C$ is the same as the value $\rho$ chosen as scale factor (44) in the series solution.
In the case of samples with equal degrees of freedom, the statistic $M$ is equivalent to Bishop's
criterion $l_1$, so that the multivariate scale factor $C$ proposed here is comparable with the scale
factor $G$ proposed by Bishop and given in equation (3). Table 2 shows a number of comparisons
for the significance levels, together with the values for the probabilities given by the series.


Table 2. χ² approximation; comparisons of scale factors. Significance points
for M with probability given by series

                                     p = 2              p = 4              p = 6
                                   M      Prob.       M      Prob.       M       Prob.

k = 5, ν = 9    5%   Bishop (b)   23.06   0.0531     67.38   0.0742    142.19   0.1633
                     Box          23.27   0.0503     68.93   0.0597    148.30   0.1041
                1%   Bishop (b)   28.82   0.0107     77.01   0.0173    156.35   0.0533
                     Box          29.01   0.0101     78.74   0.0135    163.16   0.0286

k = 5, ν = 19   5%   Bishop (b)   22.13   0.0486     61.36   0.0511    121.29   0.0660
                     Box          22.03   0.0501     61.31   0.0515    122.82   0.0556
                1%   Bishop (b)   27.34   0.0104     69.95   0.0106    133.60   0.0144
                     Box          27.47   0.0100     70.03   0.0105    135.13   0.0116





























It appears that, not only is the factor suggested here very much simpler than Bishop's,
but that it also gives a better approximation. However, it appears that even with the scale
factor $C$ this approximation fails when $p$ is large and $\nu$ is small.
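The $\chi^2$ approximation of this section can be sketched numerically as follows. The function name is hypothetical, $N$ is assumed to be the sum of the $\nu_l$, and the 5% point of $\chi^2$ with 40 degrees of freedom (55.758) is entered from standard tables rather than computed.

```python
def box_chi2_scale(p, nus):
    """Return (f, A1, A2, C) for the approximation M/C ~ chi-square with f d.f.

    A1 and A2 are the quantities of equation (49); C = 1/(1 - A1) is the
    form adopted in the text for the multivariate statistic (p > 1).
    Assumption: N is the sum of the nu_l.
    """
    k = len(nus)
    n_total = sum(nus)
    s1 = sum(1.0 / nu for nu in nus) - 1.0 / n_total
    s2 = sum(1.0 / nu**2 for nu in nus) - 1.0 / n_total**2
    a1 = (2 * p**2 + 3 * p - 1) / (6.0 * (k - 1) * (p + 1)) * s1
    a2 = (p - 1) * (p + 2) / (6.0 * (k - 1)) * s2
    f = 0.5 * (k - 1) * p * (p + 1)
    return f, a1, a2, 1.0 / (1.0 - a1)


f, a1, a2, c = box_chi2_scale(4, [9] * 5)
chi2_40_upper_5pct = 55.758          # tabulated 5% point, 40 degrees of freedom
print(f, round(c, 4), round(c * chi2_40_upper_5pct, 2))
```

For $p = 4$, $k = 5$, $\nu = 9$ the scaled point comes to 68.93, which is the entry for Box at the 5% level in Table 2.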


4. APPROXIMATIONS USING THE F DISTRIBUTION 


The $\chi^2$ approximation becomes less and less satisfactory as $p$ and $k$ are made larger and $\nu$ is
made smaller. We know, however, that for all finite $p$ and $k$, $M/C$ will tend to a type III
curve as $\nu$ becomes large. When $\nu$ is not large we might expect the point corresponding to the
distribution of $M$ in the $\beta_1$, $\beta_2$ plane to lie near the type III line, in either the type I or type VI
regions. We shall see that the use of these curves rather than the type III will enable us to
absorb a further term in the cumulant series, corresponding to the extra adjustable parameter
available with type I and type VI curves, and thus ensure agreement in the cumulants
to order $\nu^{-2}$. Although percentage points of the B-function have been tabled (Thompson,
1941), tables of the function F are usually more readily available. For this reason results
which occur in the B-function form will be inverted, so that only tables of the F distribution
will be required in using these approximations, and they will be referred to as F approximations.


4.1. Choice of relevant type of curve


The ‘start’ of the probability density function for $M$ is at zero. For the Pearson system of
frequency curves in which the restriction is made that the start of the curve is at zero, the
relation between the cumulants

$$\frac{\kappa_1\kappa_3}{\kappa_2^2} = 2\tau \tag{52}$$

corresponds with Pearson's type III curve when $\tau = 1$. If $\tau$ slightly exceeds unity the curve
falls in the type VI region, if it is slightly less than unity it falls in the type I region.
Substituting the values for the cumulants of the criterion $M$, using equation (47), we obtain,
ignoring terms of order $\nu^{-3}$,

$$\tau = \frac{1 + 4A_1 + 7A_2 + 3A_1^2}{1 + 4A_1 + 6A_2 + 4A_1^2}. \tag{53}$$

Thus for all sufficiently large values of $\nu$ the region into which the curve will fall is given by

$$\begin{array}{ccc} A_2 > A_1^2 & A_2 = A_1^2 & A_2 < A_1^2 \\ \tau > 1 & \tau = 1 & \tau < 1 \\ \text{Type VI} & \text{Type III} & \text{Type I} \end{array} \tag{54}$$

For example, from equation (50) obtained in the case of the generalized test for
homoscedasticity, it is clear that for $p = 1$ the curve will be in the type I region, and for nearly
all other cases, when $p$ is greater than 1, it will be in the type VI region.
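The rule for choosing the type of curve can be sketched as a small function. The name `pearson_region` is hypothetical; $\tau$ is evaluated as the ratio $(1 + 4A_1 + 7A_2 + 3A_1^2)/(1 + 4A_1 + 6A_2 + 4A_1^2)$, and the two illustrative cases use the equal-degrees-of-freedom values of $A_1$ and $A_2$ for $k = 5$, $\nu = 9$.

```python
def pearson_region(a1, a2):
    """Classify the approximating Pearson curve from A1 and A2.

    tau is the second-order ratio (terms of order nu**-3 ignored);
    the region is decided by the sign of A2 - A1**2.
    """
    tau = (1 + 4 * a1 + 7 * a2 + 3 * a1**2) / (1 + 4 * a1 + 6 * a2 + 4 * a1**2)
    if a2 > a1**2:
        region = "Type VI"
    elif a2 < a1**2:
        region = "Type I"
    else:
        region = "Type III"
    return tau, region


# Univariate case, k = 5, nu = 9: A1 = (k + 1)/(3 k nu), A2 = 0.
tau_uni, region_uni = pearson_region(6 / 135, 0.0)
# Multivariate case p = 4, k = 5, nu = 9: A1 = 43/225, A2 = 31/675.
tau_multi, region_multi = pearson_region(43 / 225, 31 / 675)
print(region_uni, region_multi)
```

The first case falls in the type I region and the second in the type VI region, in line with the statement above for $p = 1$ and $p > 1$.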
4.2. Type VI
The F distribution with $2P$ and $2Q$ degrees of freedom is defined by

$$p(F) = \text{constant}\; F^{P-1}(PF + Q)^{-(P+Q)}. \tag{55}$$

The $r$th moment of a quantity $bF$, where $b$ is a constant, is given by

$$\mu'_r(bF) = \left(\frac{bQ}{P}\right)^{r}\frac{\Gamma(P+r)\,\Gamma(Q-r)}{\Gamma(P)\,\Gamma(Q)}, \tag{56}$$

from which, after some algebraic reduction, we obtain the first four cumulants of $bF$ as

$$\kappa_1(bF) = P(b/P)\,(1 - 1/Q)^{-1},$$
$$\kappa_2(bF) = P(b/P)^2\,(1 + (P-1)/Q)\,(1 - 1/Q)^{-2}(1 - 2/Q)^{-1},$$
$$\kappa_3(bF) = 2P(b/P)^3\,(1 + (P-1)/Q)\,(1 + (2P-1)/Q)\,(1 - 1/Q)^{-3}(1 - 2/Q)^{-1}(1 - 3/Q)^{-1},$$
K (OF) = 6P(b/P)*{(P/Q)?(5/Q— 11/Q2) + (1 — 1/Q)? (1-3/Q + 2/Q2+ 6P/Q—13P/ 0%) 
(1—1/Q) ea pagina 1(1—4/Q)-2. (57) 










Now we have seen that $M$ is approximately distributed as $C\chi^2$, so that if $\tau$ is greater than
unity we would expect to be able to find values $b$, $P$ and $Q$, so that $bF$ would be an even better
approximation. Since we already know that the distribution is close to type III, we would
further expect that $Q$ will be large compared with $P$ since this will be so for type VI curves
close to the type III line.

If then we ignore terms of order $(P/Q)^2$, we find

$$\begin{aligned}
\kappa_1(bF) &= P(b/P)\{1 + 1/Q\},\\
\kappa_2(bF) &= P(b/P)^2\{1 + P/Q + 3/Q\},\\
\kappa_3(bF) &= 2P(b/P)^3\{1 + 3P/Q + 6/Q\},\\
\kappa_4(bF) &= 6P(b/P)^4\{1 + 6P/Q + 10/Q\}.
\end{aligned} \tag{58}$$

Now put $2P = f_1 = f$, $2Q = f_2 = \dfrac{f_1 + 2}{A_2 - A_1^2}$ and $b = \dfrac{f}{1 - A_1 - f_1/f_2}$;
then we obtain approximately

$$\begin{aligned}
\kappa_1(bF) &= f\{1 + A_1 + A_2\},\\
\kappa_2(bF) &= 2f\{1 + 2A_1 + 3A_2\},\\
\kappa_3(bF) &= 8f\{1 + 3A_1 + 6A_2\},\\
\kappa_4(bF) &= 48f\{1 + 4A_1 + 10A_2\},
\end{aligned} \tag{59}$$

which are identical to order $\nu^{-2}$ with the cumulants of $M$ given by equation (47). Thus $M/b$
will be distributed approximately as F with $f_1$ and $f_2$ degrees of freedom, where

$$f_1 = f, \qquad f_2 = \frac{f_1 + 2}{A_2 - A_1^2}, \qquad b = \frac{f}{1 - A_1 - f_1/f_2}. \tag{60}$$


4.3. Type I
We define a quantity $X$ distributed in a type I form with parameters $P$ and $Q$,

$$p(X) = \text{constant}\; X^{P-1}(1 - X)^{Q-1}. \tag{61}$$

The $r$th moment of $bX$, where $b$ is a constant, is given by

$$\mu'_r(bX) = b^r\,\frac{\Gamma(P+r)\,\Gamma(P+Q)}{\Gamma(P)\,\Gamma(P+Q+r)}, \tag{62}$$

from which we find the first four cumulants of $bX$ to be

$$\begin{aligned}
\kappa_1(bX) &= P(b/Q)\,(1 + P/Q)^{-1},\\
\kappa_2(bX) &= P(b/Q)^2\,(1 + P/Q)^{-2}(1 + (P+1)/Q)^{-1},\\
\kappa_3(bX) &= 2P(b/Q)^3\,(1 - P/Q)\,(1 + P/Q)^{-3}(1 + (P+1)/Q)^{-1}(1 + (P+2)/Q)^{-1},\\
\kappa_4(bX) &= 6P(b/Q)^4\,\{1 + 1/Q - 2P/Q - 4P/Q^2 - 2P^2/Q^2 + P^2/Q^3 + P^3/Q^3\}\\
&\qquad \times (1 + P/Q)^{-4}(1 + (P+1)/Q)^{-2}(1 + (P+2)/Q)^{-1}(1 + (P+3)/Q)^{-1}.
\end{aligned} \tag{63}$$

As before if $Q$ is large compared with $P$, so that terms of order $(P/Q)^2$ may be ignored,
we obtain

$$\begin{aligned}
\kappa_1(bX) &= P(b/Q)\{1 - P/Q\},\\
\kappa_2(bX) &= P(b/Q)^2\{1 - 3P/Q - 1/Q\},\\
\kappa_3(bX) &= 2P(b/Q)^3\{1 - 6P/Q - 3/Q\},\\
\kappa_4(bX) &= 6P(b/Q)^4\{1 - 10P/Q - 6/Q\},
\end{aligned} \tag{64}$$

and putting $2P = f_1 = f$, $2Q = f_2 = \dfrac{f_1 + 2}{A_1^2 - A_2}$, $b = \dfrac{f_2}{1 - A_1 + 2/f_2}$,
we again obtain approximately the values given in (59) which to the order of
approximation $\nu^{-2}$ are the cumulants of $M$.

Thus $M/b$ will be distributed as $X$ in expression (61) with $2P = f_1$ and $2Q = f_2$ and

$$f_1 = f, \qquad f_2 = \frac{f_1 + 2}{A_1^2 - A_2}, \qquad b = \frac{f_2}{1 - A_1 + 2/f_2}. \tag{65}$$

Alternatively, $\dfrac{f_2 M}{f_1(b - M)}$ will be distributed as F with $f_1$ and $f_2$ degrees of freedom.


We note that although $M$ can vary from 0 to $\infty$, $bX$ can vary only between the limits
0 and $b$, so that we are fitting a curve with limited range to one with infinite range. In practice,
however, this presents no difficulty (see, for example, the comparisons of Tables 3, 4 and 5),
for since the distribution of $M$ will be near to type III, $f_2$ will be large compared with $f_1$;
consequently $b$ will be large compared with $f_1$. The mean for such curves will be approximately
equal to $f_1$, so that the range will be large compared with the mean, and the part of
the curve ignored by the truncation will be negligible.


4.4. Application of the F approximation in tests of homoscedasticity
From (50) we know that when $p = 1$, $A_2 - A_1^2$ is negative, and hence the type I form of the
approximation is appropriate. When $p \geq 2$ we have seen that, except for the case $p = 2$, $k = 2$,
when to this degree of approximation the curve is almost type III, $A_2 - A_1^2$ is positive and
the type VI form is appropriate.


4.41. Univariate test ($p = 1$)


When $p = 1$, $A_2$ is zero, so that to carry out the test we calculate in turn

$$A_1 = \frac{1}{3(k-1)}\left(\sum_l \frac{1}{\nu_l} - \frac{1}{N}\right), \qquad f_1 = (k - 1), \qquad f_2 = \frac{k+1}{A_1^2}, \qquad b = \frac{f_2}{1 - A_1 + 2/f_2}, \tag{66}$$

and refer $\dfrac{f_2 M}{f_1(b - M)}$ to tables of the F distribution with $f_1$ and $f_2$ degrees of freedom.

In the special case when the degrees of freedom are equal

$$A_1 = \frac{k+1}{3k\nu}. \tag{67}$$
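The steps of the univariate test can be sketched as follows: $A_1$, $f_1$, $f_2$ and $b$ are computed in turn and the ratio $f_2M/\{f_1(b - M)\}$ is formed. The function name is hypothetical, $N$ is assumed to be the sum of the $\nu_l$, and the quantity returned still has to be referred to tables of F. The illustrative input is the case $k = 4$, $\nu = 1$, with $M = 10.3$, the 5% point quoted for the present approximation later in this section.

```python
def univariate_f_approx(M, nus):
    """Type I form of the F approximation for p = 1.

    Returns (f1, f2, b, F), where F = f2*M / (f1*(b - M)) is to be referred
    to the F distribution with f1 and f2 degrees of freedom.
    Assumption: N is the sum of the nu_l.
    """
    k = len(nus)
    n_total = sum(nus)
    a1 = (sum(1.0 / nu for nu in nus) - 1.0 / n_total) / (3.0 * (k - 1))
    f1 = k - 1
    f2 = (k + 1) / a1**2
    b = f2 / (1.0 - a1 + 2.0 / f2)
    return f1, f2, b, f2 * M / (f1 * (b - M))


f1, f2, b, F = univariate_f_approx(10.3, [1, 1, 1, 1])
print(f1, round(f2, 1), round(b, 2), round(F, 2))
```

Here $f_1 = 3$ and $f_2 = 28.8$, so the ratio obtained is compared with the 5% point of F for 3 and about 29 degrees of freedom.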
To test the accuracy of the approximation we will compare the values it gives for the 5% and
1% points of $M$, with those obtained from (1) Bartlett's approximation, (2) Bishop & Nair's
(1939) values and (3) the $\chi^2$ series given by Hartley and corresponding to equation (26) with
$p = 1$ and $\rho = 1$. Tables 3 and 4 are adapted from those given by Pearson & Hartley (1946)
with the value of Bartlett's approximation and the present approximation added. In
Table 3 a number of comparisons are made for the special case where the degrees of freedom
are equal, and Table 4 shows a few comparisons for the case of five estimates of variance with
unequal degrees of freedom.

If the accuracy is judged by the closeness of agreement with the values obtained by Bishop
& Nair, it appears that the F approximation is an improvement upon that suggested by





Table 3. Comparison of approximations. Significance points
for M (equal degrees of freedom, p = 1)

                          5%                                       1%
 k    ν    Bartlett   Box    Hartley   Bishop     Bartlett   Box    Hartley   Bishop
             (χ²)     (F)    (series)  & Nair       (χ²)     (F)    (series)  & Nair

 3    2      7.32     7.20     7.05     7.11*      11.26    10.85    10.57    10.74*
      3      6.88     6.83     6.79     6.80†      10.57    10.41    10.32    10.43†
      4      6.66     6.63     6.61     6.62*      10.23    10.14    10.10    10.13*
      9      6.28     6.29     6.28     6.30†       9.67     9.64     9.64     9.67†

 5    2     11.39    11.23    11.01    11.09*      15.93    15.52    15.15    15.32*
      3     10.75    10.69    10.62    10.67†      15.05    14.88    14.76    14.91†
      4     10.44    10.39    10.37    10.38*      14.60    14.42    14.46    14.47*
      9      9.91     9.89     9.90     9.93†      13.87    13.85    13.84    13.86†

10    2     20.02    19.68    19.45    19.62*      25.58    25.22    24.65    24.90*
      3     18.99    18.91    18.79    18.82†      24.31    24.12    23.97    24.09†
      4     18.47    18.41    18.38    18.42*      23.65    23.54    23.49    23.34*
      9     17.61    17.61    17.60    17.64†      22.69    22.58    22.53    22.48†





























* Calculated from Nair's exact distribution.
† Calculated by fitting type I curve to $L_1$.


Table 4. Comparison of approximations. Significance points for M
(unequal degrees of freedom, p = 1, k = 5)





















































                                          5%                                       1%
 N    ν₁   ν₂   ν₃   ν₄   ν₅   Bartlett   Box    Hartley   Bishop     Bartlett   Box    Hartley   Bishop
                                 (χ²)     (F)    (series)  & Nair       (χ²)     (F)    (series)  & Nair

20     6    6    4    2    2     10.70    10.65    10.54    10.59       14.97    14.82    14.62    14.80
45    16   16    9    2    2     10.45    10.40    10.30    10.35       14.62    14.51    14.31    14.46
20     5    5    4    3    3     10.49    10.46    10.41    10.43       14.68    14.58    14.51    14.59
45    14   14    9    4    4     10.07    10.05    10.04    10.05       14.09    14.05    14.03    14.05











Bartlett and is about as accurate as Hartley's series, whilst it requires no special tables and
involves only simple calculations.

Since the approximations proposed by Bartlett, Hartley and the present author are
essentially asymptotic, it is to be expected that for small values of $\nu$, and particularly when
$\nu = 1$, the approximations will break down. This does in fact happen to a certain extent with
all of them, but it seems least serious with the present F approximation; for example, when
$k = 4$, $\nu = 1$, we have

Approximation        5% point    1% point
Bartlett (χ²)          11.1        16.1
Hartley (series)        9.0        11.8
Box (F)                10.3        14.6

Nair's expansion       10.0        14.1










For the case $\nu = 1$, Table 5 compares, for a number of values of $k$, the 5% and 1% levels
given by Bartlett's approximation and by the present method with values obtained by
Bishop & Nair (1939) using Nair's expansion.


Table 5. Comparison of the approximations when ν = 1. Significance points for M

Value of k                       2      3      4      5      6      7      8      9      10

5% point   Bartlett (χ²)        5.8
           Box (F)              5.1
           Nair's expansion     5.1

1% point   Bartlett (χ²)       10.0   13.3   16.1   18.6   21.0   23.2   25.4   27.5   29.6
           Box (F)              7.9   11.3   14.6   17.1   19.2   21.5   23.7   25.8   27.9
           Nair's expansion     8.3   11.5   14.0   16.5   18.9   21.0   23.1   25.2   27.2









































4.42. Multivariate test ($p \geq 2$)


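To carry out the test we calculate the quantities

$$\begin{aligned}
A_1 &= \frac{2p^2 + 3p - 1}{6(k-1)(p+1)}\left(\sum_l \frac{1}{\nu_l} - \frac{1}{N}\right), \qquad
A_2 = \frac{(p-1)(p+2)}{6(k-1)}\left(\sum_l \frac{1}{\nu_l^2} - \frac{1}{N^2}\right),\\
f_1 &= \tfrac{1}{2}(k-1)p(p+1), \qquad f_2 = \frac{f_1 + 2}{A_2 - A_1^2}, \qquad b = \frac{f_1}{1 - A_1 - f_1/f_2},
\end{aligned} \tag{68}$$

and refer $M/b$ to the tables of the F distribution with $f_1$ and $f_2$ degrees of freedom.
When the degrees of freedom are equal

$$A_1 = \frac{(2p^2 + 3p - 1)(k+1)}{6(p+1)k\nu}, \qquad A_2 = \frac{(p-1)(p+2)(k^2 + k + 1)}{6k^2\nu^2}. \tag{69}$$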
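A minimal sketch of the multivariate calculation is given below. It covers only the type VI form, in which $M/b$ is referred to F with $f_1 = \tfrac{1}{2}(k-1)p(p+1)$, $f_2 = (f_1+2)/(A_2 - A_1^2)$ and $b = f_1/(1 - A_1 - f_1/f_2)$, and raises an error when $A_2 \leq A_1^2$. The function name is hypothetical and $N$ is assumed to be the sum of the $\nu_l$. The illustrative value $M = 69.84$ is the 5% point entered for Box (F) in Table 7 for $p = 4$, $k = 5$, $\nu = 9$.

```python
def multivariate_f_approx(M, p, nus):
    """Type VI form of the F approximation for the multivariate test.

    Returns (f1, f2, b, M/b); M/b is the quantity referred to the
    F distribution with f1 and f2 degrees of freedom.
    Assumption: N is the sum of the nu_l.
    """
    k = len(nus)
    n_total = sum(nus)
    s1 = sum(1.0 / nu for nu in nus) - 1.0 / n_total
    s2 = sum(1.0 / nu**2 for nu in nus) - 1.0 / n_total**2
    a1 = (2 * p**2 + 3 * p - 1) / (6.0 * (k - 1) * (p + 1)) * s1
    a2 = (p - 1) * (p + 2) / (6.0 * (k - 1)) * s2
    if a2 <= a1**2:
        # In this case the type I form of the approximation applies instead.
        raise ValueError("A2 <= A1**2: type VI form not applicable")
    f1 = 0.5 * (k - 1) * p * (p + 1)
    f2 = (f1 + 2) / (a2 - a1**2)
    b = f1 / (1.0 - a1 - f1 / f2)
    return f1, f2, b, M / b


f1, f2, b, ratio = multivariate_f_approx(69.84, 4, [9] * 5)
print(f1, round(f2, 1), round(b, 3), round(ratio, 3))
```

For this case $f_2$ is of the order of several thousand, which illustrates how close the fitted type VI curve lies to the type III line.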


George (1945) was able to evaluate the exact distribution of the generalized $L_1$ statistic
in simple cases, although, when the values of $p$ and $k$ are not very small, the method becomes
unmanageable. She used her exact distribution to check Bishop's approximations. Table 6
is taken from George's Table 1 and shows the equivalent value of $M$ obtained by Bishop's
empirical formula, method (a), for the 5% point, together with the exact value of the
probability obtained by George by direct integration. The probability corresponding to this
value of $M$ has also been calculated by the $\chi^2$ and F approximations suggested here. Thus the
closeness with which the exact probability approaches 0.0500 indicates the accuracy of Bishop's
method, and the closeness with which the probabilities for the $\chi^2$ and F approximations
coincide with the exact probability measures the accuracy of these approximations.

We see that the values given by the F approximation are in excellent agreement with the
exact probabilities, and even the $\chi^2$ approximation is considerably better than Bishop's
method. Unfortunately, no exact values are available in the cases where $p$ and $k$ are larger,
when approximation to the curve is more difficult. For these distributions the series given by
formula (26), using in most cases up to six* terms in the asymptotic series and up to eight*

* When $\nu = 9$, and $p = 5$ and 6, the coefficients $\alpha$ are rather large, and ten and fourteen terms
respectively had to be used in the exponential series. When $p = 6$ there is evidence that further
terms in the asymptotic series would give closer agreement.





Table 6. 5% points for M given by Bishop's empirical approximation with their associated
probabilities calculated by: (1) George's exact method, (2) the F approximation, (3) the χ²
approximation

                                   Probabilities
 p    k    ν       M        Exact       F         χ²
                           (George)   (Box)     (Box)

 2    2    9     8.824     0.0492    0.0492    0.0492
          14     7.835     0.0495*   0.0494    0.0496
          19     7.831     0.0496*   0.0496    0.0496
          24     7.828     0.0498*   0.0497    0.0497

 2    3    9    14.164     0.0491    0.0491    0.0490
          19    13.285     0.0496    0.0497    0.0496
          29    13.031     0.0497    0.0499    0.0499

 3    2    9    15.740     0.0461    0.0458    0.0446
          14    14.434     0.0475    0.0475    0.0470
          24    13.598     0.0485    0.0486    0.0485
          29    13.416     0.0488    0.0489    0.0487
          39    13.211     0.0488    0.0489    0.0488

 3    3   14    23.661     0.0481    0.0479    0.0473
          29    22.288     0.0484    0.0480    0.0478

 4    2   19    20.989     0.0461    0.0461    0.0455
          29    19.946     0.0477    0.0478    0.0476

* These values have been recalculated and do not agree with the values given in George's table.
Table 7. Comparisons of approximations. Significance points for M obtained by various
methods, with probabilities given by series (26)

                            p = 2             p = 3             p = 4              p = 5              p = 6
                          M     Prob.      M     Prob.      M     Prob.       M      Prob.       M      Prob.

k = 5, ν = 9
  5%   Bishop (a)       23.40  0.0485       —      —      71.07  0.0434        —       —      173.17  0.0105
       Bishop (b)       23.06  0.0531       —      —      67.38  0.0742        —       —      142.19  0.1633
       Box (χ²)         23.27  0.0503     42.56  0.0532   68.93  0.0597     103.65  0.0673    148.30  0.1041
       Box (F)          23.30  0.0500     42.83  0.0506   69.84  0.0524     106.40  0.0545    153.36  0.0692
       Type I           23.26  0.0504     42.88  0.0502   70.28  0.0492     107.15  0.0500    157.38  0.0488

  1%   Bishop (a)       29.19  0.0096       —      —      81.35  0.0082        —       —      192.34  0.0010
       Bishop (b)       28.82  0.0107       —      —      77.01  0.0173        —       —      156.35  0.0533
       Box (χ²)         29.01  0.0101     50.24  0.0111   78.74  0.0135     113.54  0.0225    163.16  0.0286
       Box (F)          29.05  0.0100     50.58  0.0102   79.84  0.0105     118.56  0.0122    168.94  0.0165
       Type I           29.07  0.0099     50.59  0.0102   80.45  0.0097     120.20  0.0098    173.78  0.0097

k = 5, ν = 19
  5%   Bishop (a)       22.14  0.0486       —      —      61.63  0.0489        —       —      124.25  0.0469
       Bishop (b)       22.13  0.0486       —      —      61.36  0.0511        —       —      121.29  0.0660
       Box (χ²)         22.03  0.0501     39.09  0.0506   61.31  0.0515      89.08  0.0532    122.82  0.0556
       Box (F)          22.04  0.0500     39.14  0.0501   61.47  0.0502      89.47  0.0505    123.61  0.0508
       Type I           21.92  0.0516     39.10  0.0505   61.36  0.0511      89.59  0.0496    123.70  0.0503

  1%   Bishop (a)       27.55  0.0098       —      —      70.50  0.0095        —       —      137.09  0.0088
       Bishop (b)       27.34  0.0104       —      —      69.95  0.0106        —       —      133.60  0.0144
       Box (χ²)         27.47  0.0100     46.14  0.0102   70.03  0.0105      99.56  0.0109    135.13  0.0116
       Box (F)          27.48  0.0100     46.20  0.0100   70.23  0.0101     100.01  0.0101    136.03  0.0103
       Type I           27.55  0.0098     46.24  0.0099   70.27  0.0100     100.24  0.0098    136.32  0.0099









































































terms in the exponential series, may be used as a standard for comparison. Table 7 shows the
significance points for $M$ obtained by five different methods together with the probabilities
calculated from the series. The methods are: Bishop's empirical approximation (a), Bishop's
approximation (b), the $\chi^2$ and F approximations suggested in this paper, and the fitting of
a type I curve by exact calculation of the first two moments. The values for $M$ for Bishop's
approximations and the type I approximation have been calculated from Bishop's significance
points for $l_1$ given in his Tables 9 and 10.

If we take the series solution as supplying essentially accurate values, we confirm Bishop's
suggestion that the type I curve, fitted exactly to the first two moments of $l_1$, provides an
exceedingly good approximation. Of the working approximations, the F approximation
suggested here appears to be the best and the $\chi^2$ approximation with the generalized scale
factor $C$ will be fairly satisfactory if $p$ and $k$ are not greater than five and $\nu$ is not less than,
say, twenty.

Table 8 supplies a few comparisons with equal and unequal degrees of freedom. 


Table 8. Significance points for M from χ² and F approximations for some equal and unequal
groupings, when p = 4 and k = 5, with associated probability given by series (26)

                                               5%                 1%
 N    ν₁   ν₂   ν₃   ν₄   ν₅   Method       M     Prob.        M     Prob.

95    19   19   19   19   19     χ²       61.31   0.0515     70.03   0.0105
                                 F        61.47   0.0502     70.23   0.0101

95     9    9   19   29   29     χ²       63.22   0.0578     72.33   0.0124
                                 F        63.99   0.0521     73.14   0.0107

95     9    9    9    9   59     χ²       66.32   0.0627     75.76   0.0139
                                 F        67.39   0.0535     77.07   0.0110

45     9    9    9    9    9     χ²       68.93   0.0597     78.74   0.0135
                                 F        69.84   0.0524     79.84   0.0105








It appears, at least for unequal samples with none of the degrees of freedom less than 9, 
that the F approximation will be fairly satisfactory. 


5. GENERALIZATION OF THE PROCEDURE 


The method we have developed has so far been illustrated in the case of the univariate and
multivariate tests of homoscedasticity; its application is, however, more general. In
fact, the method can be used whenever, by choosing a suitable power of the original criterion,
we can obtain a statistic $W$ which has its $h$th moment of the form

$$\mathcal{E}(W^h) = \text{constant} \times \left[\frac{\prod_{j=1}^{k} y_j^{\,y_j}}{\prod_{i=1}^{m} x_i^{\,x_i}}\right]^{h} \frac{\prod_{i=1}^{m}\Gamma\{x_i(1+h) + \xi_i\}}{\prod_{j=1}^{k}\Gamma\{y_j(1+h) + \eta_j\}}, \tag{70}$$

where $\sum_{i=1}^{m} x_i = \sum_{j=1}^{k} y_j$.

(The constant will be obtained of course by putting $h = 0$ in (70) and taking the reciprocal.)
Many of the tests in Plackett's review, referred to in the introduction to this paper, fall into
this category. We have already seen that the generalized $L_1$ statistic is of this type; others
are Wilks's test for the independence of $k$ groups of variates (which has some important
special cases, and will be considered in detail in the next section); the generalized test for
constancy of means, variances and covariances for $k$ samples given by Wilks (1932); and the
tests for ‘compound symmetry’ of variance-covariance matrices discussed by Votaw
(1948).

Another group of criteria, which has been studied by Mauchly (1940) and Wilks (1946), 
arises from tests made on a single sample of n p-variate observations. Mauchly’s criterion 
tests the hypothesis that the variances of the variates are all equal and that the covariances 
between the variates are all zero. Wilks considered criteria for testing three further 
hypotheses: 


(a) That the $p$ means, $p$ variances, and $\tfrac{1}{2}p(p-1)$ covariances for the variates have
respectively the same unknown values.

(b) That the variances are the same and the covariances are the same irrespective of what
values the means have.

(c) That the means are the same (assuming (b) true).


It is hoped to consider some of these tests rather more closely in a later paper. Here we 
shall merely note that, except for Wilks’s third criterion (which is always distributed exactly 
in type I form), the exact distribution of the test function is, in general, not exactly known. 
The expression for the hth moment, however, is in each case of the form of equation (70) and, 
as is shown below, our previous approach will provide approximations in all these cases.
Tukey & Wilks (1946) have considered this class of statistics and have pointed out that 
they all possess in common the property that, when the null hypothesis is true, they are 
distributed as a product of independent components, each component being distributed in 
type I form. 

Consider the expression (70) for the hth moment of any statistic W of this type. If we 
take $M = -2\log W$ as our working statistic, and write $(1-\rho)x_i = \beta_i$, $(1-\rho)y_j = \epsilon_j$, where
$\rho$ is a constant $< 1$ at our choice, we find for the cumulant generating function of $\rho M$

$$\psi(t) = g(t) - g(0),$$

where

$$g(t) = 2it\rho\Big[\sum_{i=1}^{m} x_i\log x_i - \sum_{j=1}^{k} y_j\log y_j\Big] + \sum_{i=1}^{m}\log\Gamma\{\rho x_i(1-2it)+\beta_i+\xi_i\} - \sum_{j=1}^{k}\log\Gamma\{\rho y_j(1-2it)+\epsilon_j+\eta_j\}, \qquad (71)$$

and $g(0)$ is independent of t and is obtained by writing $t = 0$ in (71). Expanding the logarithms
of the $\Gamma$-functions by (18), we obtain the cumulant generating function of $\rho M$ in the form

$$\psi(t) = Q - g(0) - \tfrac12 f\log(1-2it) + \sum_{r=1}\omega_r(1-2it)^{-r}, \qquad (72)$$

where

$$f = -2\Big\{\sum_{i=1}^{m}\xi_i - \sum_{j=1}^{k}\eta_j - \tfrac12(m-k)\Big\}, \qquad (73)$$

$$\omega_r = \frac{(-1)^{r+1}}{r(r+1)}\Big\{\sum_{i=1}^{m}\frac{B_{r+1}(\beta_i+\xi_i)}{(\rho x_i)^r} - \sum_{j=1}^{k}\frac{B_{r+1}(\epsilon_j+\eta_j)}{(\rho y_j)^r}\Big\}, \qquad (74)$$

$$Q = \tfrac12(m-k)\log 2\pi - \tfrac12 f\log\rho + \sum_{i=1}^{m}(x_i+\xi_i-\tfrac12)\log x_i - \sum_{j=1}^{k}(y_j+\eta_j-\tfrac12)\log y_j. \qquad (75)$$


From the cumulant generating function (72), the asymptotic $\chi^2$ series corresponding to
(26) and (30) are immediately obtainable. Alternatively, we may obtain approximations in the
manner given in §3; the method outlined there is clearly perfectly general for this whole class
of statistics. We calculate the quantities $A_1 = 2\omega_1'/f$ and $A_2 = 4\omega_2'/f$, where $\omega_r'$ is the value taken
on by $\omega_r$ when $\rho = 1$. The scale factor C for the $\chi^2$ approximation will be $1 + A_1$ or $(1-A_1)^{-1}$;
a decision between these two alternative forms can be reached by the considerations set out
in §3·1. Then, to this order of approximation, M/C will be distributed as $\chi^2$. If greater
accuracy is required we may use the F type approximations described in §4. The particular
form is decided by the sign of the quantity $A_2 - A_1^2$. If this quantity is positive, the curve of
best fit will be type VI. Putting

$$f_1 = f, \qquad f_2 = \frac{f_1+2}{A_2-A_1^2}, \qquad b = \frac{f_1}{1-A_1-f_1/f_2}, \qquad (76)$$

M/b is distributed approximately as the variance ratio F with $f_1$ and $f_2$ degrees of freedom.
Alternatively, if $A_2 - A_1^2$ is negative the best fitting curve will be type I, and if we put

$$f_1 = f, \qquad f_2 = \frac{f_1+2}{A_1^2-A_2}, \qquad b = \frac{f_2}{1-A_1+2/f_2}, \qquad (77)$$

then approximately $\dfrac{f_2 M}{f_1(b-M)}$ will be distributed as F with $f_1$ and $f_2$ degrees of freedom.

There are thus a number of possible levels of approximation as measured by the order of 
agreement between the cumulants of the statistic and those of the fitted curve. 

(1) Ignoring terms of order $x_i^{-1}$, $y_j^{-1}$, M is distributed as $\chi^2$.

(2) Ignoring terms of order $x_i^{-2}$, $y_j^{-2}$; by a technique originally used by Bartlett and here
generalized, a quantity C can be found such that M is distributed as $C\chi^2$.

(3) Ignoring terms of order $x_i^{-3}$, $y_j^{-3}$, a function of M can be obtained which is distributed
as the variance ratio F.

(4) Finally, for very precise work and for checking other approximations, a $\chi^2$ series
solution may be used and here agreement with the cumulants (as represented by their
asymptotic expansions) of the statistic can be obtained to as great an order as seems profitable.

In practice method (4) is sometimes rather long, although it has been found very accurate, 
but (3) involves very little labour and will often be sufficiently precise. 
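Levels (2) and (3) reduce to a few lines of arithmetic once f, $A_1$ and $A_2$ are known. The following is a minimal Python sketch of that arithmetic, using equations (76) and (77); the function names and the numerical inputs are illustrative assumptions and are not taken from the paper. The scale factor is returned in the $(1-A_1)^{-1}$ form only; the choice between that and $1+A_1$ is left to the considerations of §3·1.

```python
def chi2_and_f_parameters(f, a1, a2):
    """Constants of the chi-square and F approximations for M.

    f  : degrees of freedom of the leading chi-square term
    a1 : A_1 = 2*omega'_1 / f
    a2 : A_2 = 4*omega'_2 / f
    """
    # Scale factor in the (1 - A_1)^(-1) form; the text also allows 1 + A_1.
    c = 1.0 / (1.0 - a1)
    d = a2 - a1 * a1
    f1 = f
    if d > 0:
        # Type VI curve, equation (76): M / b is referred to F(f1, f2).
        f2 = (f1 + 2) / d
        b = f1 / (1 - a1 - f1 / f2)
        curve = "VI"
    elif d < 0:
        # Type I curve, equation (77): f2*M / (f1*(b - M)) is referred to F(f1, f2).
        f2 = (f1 + 2) / (-d)
        b = f2 / (1 - a1 + 2 / f2)
        curve = "I"
    else:
        raise ValueError("A_2 - A_1^2 = 0: neither (76) nor (77) applies")
    return {"C": c, "curve": curve, "f1": f1, "f2": f2, "b": b}


def f_ratio(m, par):
    """Convert an observed M into the quantity referred to the F tables."""
    if par["curve"] == "VI":
        return m / par["b"]
    return par["f2"] * m / (par["f1"] * (par["b"] - m))


# Illustrative inputs only (hypothetical values of f, A_1 and A_2).
par = chi2_and_f_parameters(f=12, a1=0.30, a2=0.10)
print(par["curve"], round(par["f2"], 2), round(par["b"], 3))
```

The sign test on $A_2 - A_1^2$ is kept explicit so that the two forms of the F statistic cannot be confused.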

As a second example of the application of this technique we consider Wilks’s generalized 
test of independence. 


6. THE GENERALIZED TEST FOR INDEPENDENCE 


Wilks (1935) considered the following problem: suppose we have a sample of $\nu + u$ observations
for a kp-variate normal population and we have some a priori reason for dividing
the variates into k groups containing $p_1, \ldots, p_l, \ldots, p_k$ variates (where $\sum_{l=1}^{k} p_l = kp$ and
p is thus the average size of the groups and is not necessarily integer). It is desired to test
the hypothesis that the k groups of residuals, obtained after fitting u independent constants
to each of the variates, are mutually independent.

If $|c_{ij}|$ is the $kp \times kp$ determinant of sums of squares and products of residuals for the kp
variates and $|c_{ij}|_l$ is the $p_l \times p_l$ determinant of sums of squares and products of residuals of
the lth group, then the likelihood ratio criterion obtained by Wilks is

$$\Lambda = \frac{|c_{ij}|}{\prod_{l=1}^{k}|c_{ij}|_l} = \frac{|r_{ij}|}{\prod_{l=1}^{k}|r_{ij}|_l}, \qquad (78)$$

where $|r_{ij}|$ and $|r_{ij}|_l$ are the corresponding determinants of sample correlation coefficients
having v degrees of freedom. Wilks obtained the moments, and also, for special sets of values 
of k and p, the exact distribution of his criterion which generalizes a very large class of 
statistical tests. Problems in which there are more than two groups of variates, i.e. where 
k > 2, occur for example in educational research; we may have some prior reason for believing 
that a battery of, say, ten different tests applied to pupils may be divided up into a number 
of groups, each group concerned with some distinct ability, and may wish therefore to test 
the hypothesis that, when the means are eliminated, the selected groups are independent 
of each other. 
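As a concrete illustration of (78), the sketch below computes $\Lambda$ from a sample correlation matrix and a list of group sizes, and then $M = -\nu\log\Lambda$. It is a minimal standard-library Python sketch; the helper names (`det`, `wilks_lambda`, `m_statistic`) and the small correlation matrix are hypothetical, and the variates are assumed to be ordered so that each group occupies a contiguous block.

```python
from math import log


def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in a]
    n = len(m)
    d = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(m[i][c]))
        if m[piv][c] == 0:
            return 0.0
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d  # a row interchange changes the sign
        d *= m[c][c]
        for i in range(c + 1, n):
            factor = m[i][c] / m[c][c]
            for j in range(c, n):
                m[i][j] -= factor * m[c][j]
    return d


def wilks_lambda(r, sizes):
    """Lambda of (78): |r| divided by the product of the diagonal-block determinants."""
    lam = det(r)
    start = 0
    for p in sizes:
        block = [row[start:start + p] for row in r[start:start + p]]
        lam /= det(block)
        start += p
    return lam


def m_statistic(r, sizes, nu):
    """M = -nu * log(Lambda), nu being the degrees of freedom of the correlations."""
    return -nu * log(wilks_lambda(r, sizes))


# Hypothetical correlations: the first two variates form one group, the third another.
r = [[1.0, 0.5, 0.3],
     [0.5, 1.0, 0.2],
     [0.3, 0.2, 1.0]]
print(round(wilks_lambda(r, [2, 1]), 4))
```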

When k = 2, we consider only two groups of variates containing $p_1$ in the first and $p_2$ in
the second. Since the criterion and its distribution will be unaffected if the set of $p_1$ variates
are fixed independent variables and the set of $p_2$ variates ‘dependent’ variables distributed
in a $p_2$-variate normal distribution, the function is then appropriate for testing the general
multivariate linear hypothesis (see, for example, Bartlett, 1934, 1938, 1947). If, in addition,
$p_2 = 1$, then the likelihood criterion is $\Lambda = 1 - R^2$, where R is the coefficient of multiple
correlation between the single dependent variate and the $p_1$ independent variates. A second
special case of Wilks’s statistic which is of some interest, and is considered more fully later
in this section, occurs when there is only one variate in each of the k groups. The statistic
then supplies an overall test for independence between k variates. For the general statistic
Wald & Brookner (1941), using a rather different technique from that of Wilks, were able to
extend the catalogue of values of k and $p_l$ for which the distribution of $\Lambda$ is exactly known in
terms of elementary functions, to include all cases where at most one group contains an odd
number of variates. These distributions, although exact, are rather complicated in character.
As an alternative and to cover the remaining cases, these authors obtained a series solution
and Rao (1948) modified this series in the important case where k = 2 to provide an improved
test in problems of multivariate analysis. These series will thus appear as special cases of
that which we are now investigating.


6-1. Derivation of the series 
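The hth moment of $\Lambda$ is given by Wilks as

$$E(\Lambda^h) = \prod_{i=0}^{kp-1}\frac{\Gamma\{\tfrac12(\nu-i)+h\}}{\Gamma\{\tfrac12(\nu-i)\}} \Bigg/ \prod_{l=1}^{k}\prod_{j=0}^{p_l-1}\frac{\Gamma\{\tfrac12(\nu-j)+h\}}{\Gamma\{\tfrac12(\nu-j)\}}, \qquad (79)$$

so that if we write $W = \Lambda^{\nu/2}$, the hth moment of W will be in the form given in equation (70);
taking as our logarithmic statistic $M = -2\log W$ we obtain

$$M = -\nu\log\Lambda. \qquad (80)$$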













To obtain the series, we begin as before by defining the relationship between a quantity $\rho$
(less than unity) and quantities $\mu$ and $\beta$ by the equations $\rho = \mu/\nu$, $\nu = \mu + \beta$. It is also
convenient to define a set of quantities

$$\Sigma_s = \Big(\sum_{l=1}^{k} p_l\Big)^{s} - \sum_{l=1}^{k} p_l^{\,s}, \qquad (81)$$

which appear in the solution in much the same way as the corresponding sums appear in the
tests for homoscedasticity. Then, as before, we obtain equation (72) for the cumulant generating
function of $\rho M$, and the constants are available by direct substitution in (73),
(74) and (75),

$$f = \tfrac12\Sigma_2, \qquad (82)$$

$$\omega_r = \frac{\alpha_r}{\mu^r}, \qquad \alpha_r = \frac{(-1)^{r+1}2^r}{r(r+1)}\Big\{\sum_{i=0}^{kp-1}B_{r+1}\Big(\frac{\beta-i}{2}\Big) - \sum_{l=1}^{k}\sum_{j=0}^{p_l-1}B_{r+1}\Big(\frac{\beta-j}{2}\Big)\Big\}, \qquad (83)$$

$$Q = -\tfrac12 f\log\tfrac12\mu. \qquad (84)$$


The calculation of $\alpha_r$ from formula (83) would clearly be extremely laborious for all but
small numbers of variates; we therefore seek an alternative simpler form. Using relations
(32) and (35), we find

1 ah as iy r+1 (ry +23 > ay i-1 pee-a én 
me r(r+ Hose Biles i) 2 » {2 " -(¢- 2] j 8,( Pr), (85) 
B+ B) 
ses nove. —22%) 2,42), 00 


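and the values taken by (86) when p = 1, 2, ..., 7, are given by putting $p = p_l$ in equation
(42). Writing $\alpha_r'$ for the value which $\alpha_r$ has when $\rho = 1$, i.e. when $\beta = 0$, then substituting for
$\delta_r(p_l)$ in (85) and summing, we obtain for the first six values of $\alpha_r'$:

$$\begin{aligned}
\alpha_1' &= \tfrac{1}{24}\{2\Sigma_3 + 3\Sigma_2\},\\
\alpha_2' &= \tfrac{1}{48}\{\Sigma_4 + 2\Sigma_3 - \Sigma_2\},\\
\alpha_3' &= \tfrac{1}{720}\{6\Sigma_5 + 15\Sigma_4 - 10\Sigma_3 - 30\Sigma_2\},\\
\alpha_4' &= \tfrac{1}{480}\{2\Sigma_6 + 6\Sigma_5 - 5\Sigma_4 - 20\Sigma_3 + 3\Sigma_2\},\\
\alpha_5' &= \tfrac{1}{840}\{2\Sigma_7 + 7\Sigma_6 - 7\Sigma_5 - 35\Sigma_4 + 7\Sigma_3 + 49\Sigma_2\},\\
\alpha_6' &= \tfrac{1}{2016}\{3\Sigma_8 + 12\Sigma_7 - 14\Sigma_6 - 84\Sigma_5 + 21\Sigma_4 + 196\Sigma_3 - 10\Sigma_2\},
\end{aligned} \qquad (87)$$

where $\Sigma_s$ is defined by equation (81), whence we have for the $\alpha$’s

$$\begin{aligned}
\alpha_1 &= \alpha_1' - (f/2)\beta,\\
\alpha_2 &= \alpha_2' - \alpha_1'\beta + (f/4)\beta^2,\\
\alpha_3 &= \alpha_3' - 2\alpha_2'\beta + \alpha_1'\beta^2 - (f/6)\beta^3,\\
\alpha_4 &= \alpha_4' - 3\alpha_3'\beta + 3\alpha_2'\beta^2 - \alpha_1'\beta^3 + (f/8)\beta^4,\\
\alpha_5 &= \alpha_5' - 4\alpha_4'\beta + 6\alpha_3'\beta^2 - 4\alpha_2'\beta^3 + \alpha_1'\beta^4 - (f/10)\beta^5,\\
\alpha_6 &= \alpha_6' - 5\alpha_5'\beta + 10\alpha_4'\beta^2 - 10\alpha_3'\beta^3 + 5\alpha_2'\beta^4 - \alpha_1'\beta^5 + (f/12)\beta^6.
\end{aligned} \qquad (88)$$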


As before, from the cumulant generating function we obtain the series corresponding to
(26) and (30), and if $\rho$ is chosen so that $\alpha_1 = 0$, we have


” Es , 


Wald & Brookner (1941) derived a $\chi^2$ series for this statistic by a different method from
that used here; it is not difficult to show, however, that the series they obtained is equivalent
to our series, but with $\rho = 1$. In this form the series is of little practical use for small, or even
moderate values of $\nu$ because of the difficulty we have noted before of adequately approximating
to $\exp\sum\alpha_r\mu^{-r}(1-2it)^{-r}$ by means of a series, unless $\alpha_r$ is small or $\mu$ is large. By introducing
the factor $\rho$, the size of the coefficients $\alpha_r$ can be greatly reduced and the series be used
even for fairly small values of $\nu$. As an example, consider the case of three groups of variates
with two variates in each grouping, k = 3, $p_1 = 2$, $p_2 = 2$, $p_3 = 2$, and suppose $\nu = 10$. The
values of the coefficients $\alpha_r/\mu^r$ are shown below when $\rho = 1$ and also when $\rho$ takes on a value
making $\alpha_1$ zero. When $\rho = 1$, $\mu$ is of course equal to $\nu$.


Values of $\alpha_r/\mu^r$

    r            ρ = 1       ρ = 0.683

    1            1.90000     0.00000
    2            0.33500     0.07317
    3            0.08633     0.00371
    4            0.02688     0.00178
    5            0.00940     0.00023
    6            0.00355     0.00007

    Total        2.36116     0.07896
    g(0) − Q     2.36361     0.07898
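The entries of this table can be checked directly from (81), (87) and (88). The following Python sketch, using exact rational arithmetic, is one way of doing so; the function names are hypothetical, and the compact binomial form used for (88) is an assumption read off from the pattern of the six printed equations.

```python
from fractions import Fraction
from math import comb


def sigma(sizes, s):
    """Sigma_s of equation (81) for group sizes p_1, ..., p_k."""
    return sum(sizes) ** s - sum(p ** s for p in sizes)


def alpha_prime(sizes):
    """alpha'_1, ..., alpha'_6 of equation (87), i.e. the values at rho = 1."""
    S = {s: sigma(sizes, s) for s in range(2, 9)}
    return [
        Fraction(2*S[3] + 3*S[2], 24),
        Fraction(S[4] + 2*S[3] - S[2], 48),
        Fraction(6*S[5] + 15*S[4] - 10*S[3] - 30*S[2], 720),
        Fraction(2*S[6] + 6*S[5] - 5*S[4] - 20*S[3] + 3*S[2], 480),
        Fraction(2*S[7] + 7*S[6] - 7*S[5] - 35*S[4] + 7*S[3] + 49*S[2], 840),
        Fraction(3*S[8] + 12*S[7] - 14*S[6] - 84*S[5] + 21*S[4]
                 + 196*S[3] - 10*S[2], 2016),
    ]


def alpha(sizes, beta):
    """alpha_1, ..., alpha_6 of equation (88) for a given beta."""
    f = Fraction(sigma(sizes, 2), 2)
    ap = alpha_prime(sizes)
    out = []
    for r in range(1, 7):
        # Binomial pattern of (88): sum over s of C(r-1, s) (-beta)^s alpha'_{r-s}.
        total = sum(comb(r - 1, s) * (-beta) ** s * ap[r - s - 1] for s in range(r))
        total += (-1) ** r * f / (2 * r) * beta ** r
        out.append(total)
    return out


sizes, nu = [2, 2, 2], 10
ap = alpha_prime(sizes)
beta = 2 * ap[0] / Fraction(sigma(sizes, 2), 2)   # makes alpha_1 = 0
mu = nu - beta
for r, (a1, a) in enumerate(zip(ap, alpha(sizes, beta)), start=1):
    print(r, round(float(a1 / nu ** r), 5), round(float(a / mu ** r), 5))
```

With these inputs $\beta = 19/6$, so that $\rho = \mu/\nu = 41/60$, in agreement with the value 0.683 heading the second column.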


For the Wald & Brookner series, if $\nu$ is small, the coefficients are so large that in practice
it would be impossible to represent the exponent adequately by a reasonably small number of
terms of the exponential series; by suitably choosing $\rho$, however, the size of the coefficients
is greatly reduced while the agreement between the sum of the terms and $g(0) - Q$ is improved.
In the particular example quoted, the exact distribution is known (Wilks, 1935). It
appears in rather a complicated form, but has been used here to check the series and the
approximations. Table 9 shows the 5 % and 1 % significance points for the criterion M
obtained by using the $\chi^2$ and F approximations which are derived in the next section, together
with the exact probabilities and the probabilities calculated from the series, using $\rho = 0.683$,
and six terms in the asymptotic and eight in the exponential series. Agreement to four places
of decimals is usually obtained between the series and the exact value for the probability.

Table 9. Some comparisons for Wilks’s statistic (k = 3, $p_1 = p_2 = p_3 = 2$)

                      χ² approximation                 F approximation
                               Probability                      Probability
    ν     Level      M        Exact    Series         M        Exact    Series

    10    5 %      30.770    0.0612   0.0612        31.357    0.0549   0.0548
          1 %      38.366    0.0139   0.0139        39.280    0.0117   0.0117
    20    5 %      24.982    0.0516   0.0516        25.083    0.0504   0.0504
          1 %      31.149    0.0105   0.0105        31.292    0.0101   0.0101


$\chi^2$ approximation. Following the previous procedure, we find that M/C is distributed
approximately as $\chi^2$ with f degrees of freedom, where

$$C^{-1} = 1 - \frac{1}{12\nu f}(2\Sigma_3 + 3\Sigma_2) \quad\text{and}\quad f = \tfrac12\Sigma_2.$$

F approximation. We have

$$f = \tfrac12\Sigma_2, \qquad A_1 = \frac{1}{12\nu f}(2\Sigma_3 + 3\Sigma_2), \qquad A_2 = \frac{1}{12\nu^2 f}(\Sigma_4 + 2\Sigma_3 - \Sigma_2),$$

from which, using equations (76) and (77), the F type approximation can be easily computed.
The quantities $\Sigma_2$, $\Sigma_3$ and $\Sigma_4$ required in these approximations are given by (81); the
calculations of Table 9 give some indication of the accuracy to be expected.
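As a check on these expressions, the short Python sketch below evaluates f, $A_1$, $A_2$ and C for the example of Table 9 and recovers the 5 % point of M given there for the $\chi^2$ approximation. The function names are hypothetical, and the upper 5 % point of $\chi^2$ with 12 degrees of freedom (21.026) is an assumed value taken from standard tables rather than from the paper.

```python
def sigma(sizes, s):
    """Sigma_s of equation (81) for group sizes p_1, ..., p_k."""
    return sum(sizes) ** s - sum(p ** s for p in sizes)


def wilks_constants(sizes, nu):
    """f, A_1, A_2 and the chi-square scale factor C for Wilks's statistic."""
    s2, s3, s4 = (sigma(sizes, s) for s in (2, 3, 4))
    f = s2 / 2
    a1 = (2 * s3 + 3 * s2) / (12 * nu * f)
    a2 = (s4 + 2 * s3 - s2) / (12 * nu ** 2 * f)
    c = 1 / (1 - a1)  # M / C is referred to chi-square with f degrees of freedom
    return f, a1, a2, c


f, a1, a2, c = wilks_constants([2, 2, 2], nu=10)
chi2_5pc = 21.026  # assumed tabulated upper 5% point of chi-square, 12 d.f.
print(f, round(a1, 4), round(a2, 4), round(c * chi2_5pc, 3))
```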


6-2. Special cases 
We consider two important special cases of the statistic, that in which there are only two
groups of variates and that in which there is only one variate in each of the k groups.

6-21. Case k = 2

In this case the expressions for the coefficients in the series simplify considerably. Writing
$p_1 = p$, $p_2 = q$, we obtain

$$f = pq, \qquad \rho = 1 - \frac{p+q+1}{2\nu}, \qquad \beta = \tfrac12(p+q+1),$$

$$\alpha_1 = 0, \qquad \alpha_2 = \frac{pq}{48}(p^2+q^2-5),$$

$$\alpha_3 = 0, \qquad \alpha_4 = \frac{pq}{1920}\{3p^4 + 3q^4 + 10p^2q^2 - 50(p^2+q^2) + 159\},$$

$$\alpha_5 = 0, \qquad \alpha_6 = \frac{pq}{16{,}128}\{3(p^6+q^6) - 105(p^4+q^4) + 1{,}113(p^2+q^2) + (21p^2 - 350 + 21q^2)p^2q^2 - 2{,}995\}.$$

Putting these values in (26) and (30) we confirm* the series given to terms in $\nu^{-4}$ by Rao
(1948) for this case, k = 2. Bartlett had already (1938) obtained the $\chi^2$ approximation using
the scale factor $C^{-1} = 1 - \tfrac12(p+q+1)/\nu$ (which is of course the factor given by the present
procedure). Rao introduced this scale factor into the Wald & Brookner series, so as to obtain a $\chi^2$
series with Bartlett’s $\chi^2$ approximation as the leading term, equivalent to (30). As we have
seen, this choice of factor results in this particular case in $\alpha_1$ and the $\alpha$’s of odd order being
zero, so that the calculation of the series is correspondingly simpler.


* There appears to be a misprint in Rao’s paper in the expression which corresponds with $\alpha_4$, where
the constant 159 is wrongly given as 150.





$\chi^2$ (Bartlett) approximation. $\Big\{1 - \dfrac{p+q+1}{2\nu}\Big\}M$ is approximately distributed as $\chi^2$
with f = pq degrees of freedom.

F approximation. We find $A_2 - A_1^2 = (p^2+q^2-5)/12\nu^2$; thus for p and q > 2, $A_2 - A_1^2 > 0$,
and the type VI form will be appropriate. M/b will be approximately distributed as F with
$f_1$ and $f_2$ degrees of freedom, where

$$f_1 = pq, \qquad f_2 = \frac{12\nu^2(pq+2)}{p^2+q^2-5}, \qquad b = \frac{f_1}{1 - \dfrac{p+q+1}{2\nu} - \dfrac{f_1}{f_2}}.$$


For p or q equal to 1 and 2, the exact distributions are known and provide simple tests
(Wilks, 1932, 1935). For these cases $\Lambda$ and $\sqrt{\Lambda}$ respectively are distributed in a type I distribution,
and the significance test can be made, either by directly entering Thompson’s
tables of percentage points of the incomplete B-function, or by inversion of the statistic to
its equivalent ‘variance ratio’ form and using tables of F or of Fisher’s z (Bartlett, 1934;
Rao, 1948). As has been pointed out by Bartlett (1938), if p = 1 and q = 2 (or p = 2 and q = 1),
$\{(\nu-2)/\nu\}M$ is distributed exactly as $\chi^2$, and substituting these values for p and q in the expressions
for $\alpha_r$ we find that in this case all these coefficients are zero, so providing a useful check.
If p and q were both unity, $A_2 - A_1^2$ would be negative and the type I form be appropriate
for the F approximation. Of course we shall not need to use the method here because the
criterion $\sqrt{(1-\Lambda)}$ is the sample correlation coefficient r and the exact distribution is known.
The exact distributions are also known in certain other cases (Wilks, 1935; Wald & Brookner,
1941); the form which these take, however, is rather complicated, but they are useful to
check approximations. In Table 10 are shown the 5 % significance points of M for a number
of combinations of p and q as given by the $\chi^2$ and F methods of approximation. In the cases
chosen, the exact distribution is known, and this has been used to calculate the exact pro- 
bability associated with each of these points. For comparison, the probability given by the 
series, using terms up to a, in the asymptotic series and, for most values, up to a, in the 
exponential series, is also shown. 

We see that, providing $\nu$ is sufficiently large, Bartlett’s approximation is in good agreement
with the exact values, and the F approximation, since it involves very little more labour,
provides a worth-while improvement. If $\nu$ is not large and one is doubtful whether these
approximations will be sufficient, a rough but useful indication is provided by comparing
the values obtained by the $\chi^2$ and F approximations (in calculating the F approximation
one will have already calculated the quantities needed for the $\chi^2$ approximation). If these
two approximations give substantially the same value, it may generally be taken as an
indication that the approximation is adequate. If they differ markedly, a more accurate
value should be calculated from the series.
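The check mentioned for the case k = 2, that the coefficients vanish when p = 1 and q = 2, is easily carried out numerically. The Python sketch below evaluates $\alpha_2$, $\alpha_4$ and $\alpha_6$ in exact arithmetic and also computes a significance point of M from Bartlett's factor; the function names are hypothetical, and the 5 % point of $\chi^2$ with 5 degrees of freedom (11.0705) is an assumed value from standard tables.

```python
from fractions import Fraction


def two_group_coefficients(p, q):
    """alpha_2, alpha_4, alpha_6 for k = 2 when rho = 1 - (p + q + 1)/(2 nu)."""
    pq = p * q
    a2 = Fraction(pq, 48) * (p**2 + q**2 - 5)
    a4 = Fraction(pq, 1920) * (3*p**4 + 3*q**4 + 10*p**2*q**2
                               - 50*(p**2 + q**2) + 159)
    a6 = Fraction(pq, 16128) * (3*(p**6 + q**6) - 105*(p**4 + q**4)
                                + 1113*(p**2 + q**2)
                                + (21*p**2 - 350 + 21*q**2)*p**2*q**2 - 2995)
    return a2, a4, a6


def bartlett_point(p, q, nu, chi2_point):
    """Significance point of M implied by Bartlett's chi-square approximation."""
    return chi2_point / (1 - (p + q + 1) / (2 * nu))


print(two_group_coefficients(1, 2))                  # all three vanish
print(round(bartlett_point(1, 5, 10, 11.0705), 3))   # compare Table 10, p = 1, q = 5
```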

Table 10. 5 % significance points for M

                         χ² (Bartlett)                    F (Box)
                                Probability                     Probability
    p     q     ν       M       Exact    Series         M       Exact    Series

    1     1     9      4.610   0.0494   0.0494        4.592    0.0499   0.0499
    1     5    10     17.032   0.0624   0.0624       17.542    0.0555   0.0555
               20     13.419   0.0518   0.0518       13.504    0.0504   0.050
    1    10    20     26.153   0.0666   0.0666       27.022    0.0562   0.0562
               40     21.538   0.0525   0.0525       21.690    0.0505   0.0505
    2     2     9     13.137   0.0515   0.0515       13.200    0.0506   0.0506
    2     5    10     30.512   0.0737   0.0737       31.654    0.0614   0.0614
               20     22.884   0.0529   0.0529       23.053    0.0507   0.0507
    2    10    20     46.534   0.0753   0.0753       48.164    0.0595   0.0595
               40     37.505   0.0535   0.0535       37.775    0.0507   0.0507
    4     4    10     47.811   0.0945   0.0940       49.996    0.0735   0.0731*
               20     33.931   0.0542   0.0542       34.216    0.0512   0.0512

* With $\nu = 10$ and p = q = 4, six terms were taken in the asymptotic series and twelve in the exponential
series; greater accuracy can be obtained by taking more terms in the asymptotic series.

6-22. Case $p_l = 1$, $l = 1, 2, \ldots, k$

If the k groups each contain only one variate, the hypothesis tested is that each of the
variates is independent of all the others. The $\Lambda$ criterion then becomes the determinant of
the sample correlation matrix, e.g., if k = 3,

$$\Lambda = \begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{12} & 1 & r_{23} \\ r_{13} & r_{23} & 1 \end{vmatrix},$$

where $r_{ij}$ is the usual sample product moment correlation coefficient between the ith and jth
variates. When k = 2 the criterion is simply $1 - r_{12}^2$.

The statistic is useful in supplying an overall test of independence between the k variates. 
For example, when k = 5 there will be ten individual correlation coefficients. Even when the 
null hypothesis, that all the variates are uncorrelated, is true, we shall expect often to come
across individual coefficients which are ‘significant’. For such a case it will be appropriate
to apply the overall test before testing individual correlations. Again, the expressions for the
coefficients simplify and we find, choosing $\rho$ so that $\alpha_1 = 0$,

$$f = \tfrac12 k(k-1), \qquad \rho = 1 - \frac{2k+5}{6\nu}, \qquad \beta = \frac{2k+5}{6},$$

$$\alpha_1 = 0, \qquad \alpha_2 = \frac{k(k-1)}{288}(2k^2 - 2k - 13),$$

$$\alpha_3 = \frac{k(k-1)}{3240}(k-2)(2k-1)(k+1),$$

$$\alpha_4 = \frac{k(k-1)}{34{,}560}(16k^4 - 32k^3 - 252k^2 + 268k + 1147),$$

$$\alpha_5 = \frac{k(k-1)}{136{,}080}(k-2)(2k-1)(k+1)(8k^2 - 8k - 97),$$

$$\alpha_6 = \frac{k(k-1)}{7{,}838{,}208}(496k^6 - 1{,}488k^5 - 12{,}576k^4 + 27{,}632k^3 + 137{,}490k^2 - 151{,}554k - 562{,}103).$$

For the $\chi^2$ approximation we find, from the argument of §3·1, that we should take
$\dfrac{1}{C} = 1 - \dfrac{2k+5}{6\nu}$. Thus $\Big\{1 - \dfrac{2k+5}{6\nu}\Big\}M$ will be approximately distributed as $\chi^2$ with $\tfrac12 k(k-1)$
degrees of freedom.

For the F approximation we have

$$f = \tfrac12 k(k-1), \qquad A_1 = \frac{2k+5}{6\nu}, \qquad A_2 = \frac{k^2+3k+2}{6\nu^2}.$$

For k = 2 and 3 we use the type I form and for $k \ge 4$ the type VI, since $A_2 - A_1^2 = \dfrac{2k^2-2k-13}{36\nu^2}$
is negative when k = 2 or 3 and positive for larger values of k. We then calculate $f_1$, $f_2$ and b,
required in this approximation, by formulae (77) and (76) respectively.
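For the overall test of independence between k variates the working rules above can be collected in a few lines. The following Python sketch takes the determinant of the sample correlation matrix as given and returns the scaled statistic for the $\chi^2$ approximation, its degrees of freedom, and the curve type indicated for the F approximation. The function name and the numerical inputs are hypothetical.

```python
from math import log


def overall_independence(det_r, k, nu):
    """Chi-square working statistic for the test that k variates are independent.

    det_r : determinant of the k x k sample correlation matrix (the criterion Lambda)
    nu    : degrees of freedom of the correlation coefficients
    """
    m = -nu * log(det_r)                        # M = -nu log Lambda
    scaled = (1 - (2 * k + 5) / (6 * nu)) * m   # referred to chi-square
    df = k * (k - 1) // 2
    # The sign of A_2 - A_1^2 is that of 2k^2 - 2k - 13.
    curve = "I" if 2 * k * k - 2 * k - 13 < 0 else "VI"
    return scaled, df, curve


# Hypothetical example: two variates with r = 0.5 and nu = 9, so Lambda = 0.75.
print(overall_independence(0.75, 2, 9))
```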




7. SUMMARY AND CONCLUSIONS 


For a particular class of likelihood criteria, whose moments appear as the product of $\Gamma$-functions,
a general method is described for obtaining probability levels when the null hypothesis
is true. A number of statistics whose moments appear in this form are referred to, and
a general method developed to obtain:

(a) A series which is in close agreement with the exact distribution.

(b) An approximate solution, using a single $\chi^2$ distribution, which is sufficiently accurate
for moderate or large samples.

(c) A rather better approximation, using a single F distribution, giving close agreement
even when the samples are rather small.

The method is illustrated for the following two general statistics: 


(1) Tests for constancy of variance and covariance 


(a) Univariate case. The F approximation is of the same order of accuracy as Hartley’s 
(1940) series solution although it requires very much less calculation, and significance may 
be judged by consulting tables of the significance points of the variance ratio F alone. 

(b) Multivariate case. The series solution shows remarkably close agreement with the 
exact distribution when this is known, and is used in other cases to compare approximations. 
The $\chi^2$ approximation does not correspond with that found by Bishop (1939), but is, in
fact, simpler and more accurate. 

The series confirms the accuracy of significance points found by fitting a type I curve to 
the first two moments of l,. The calculation of the moments involved in this method renders 
it too laborious for routine use, and Bishop suggested two working approximations; the F 
approximation developed here is more accurate than these approximations, whilst it involves 
no more labour and can be used when the sample sizes are unequal. 










(2) Wilks’s test for independence of k groups of variates 

The asymptotic series, and $\chi^2$ and F approximations are derived for this case, and the
relation of the results with those of Wald & Brookner, Bartlett, and Rao is discussed. The
exact distribution is used to assess the accuracy of the proposed methods in a number of cases.
The probabilities given by the series are found to be in excellent agreement with the true
values, even for fairly small samples. Providing the sample sizes are not too small, the $\chi^2$ and
F approximations will be sufficiently accurate, the latter providing the better approximation,
and allowing the sample size to be rather smaller than is possible with the $\chi^2$ approximation.
When the number of variates in each group is one, we have a test criterion for the hypothesis 
that k variates are mutually independent, and the same procedure provides the series solution 
and simple approximations for tests of significance. 


In conclusion, I wish gratefully to acknowledge the help and guidance I have received 
from Dr H. O. Hartley throughout this investigation. 


REFERENCES 


Barnes, E. W. (1899). Mess. Math. p. 64.
Bartlett, M. S. (1934). Proc. Camb. Phil. Soc. 30, 327.
Bartlett, M. S. (1937). Proc. Roy. Soc. A, 160, 268.
Bartlett, M. S. (1938). Proc. Camb. Phil. Soc. 34, 33.
Bartlett, M. S. (1947). J.R. Statist. Soc. Suppl. 9, 176.
Bishop, D. J. (1939). Biometrika, 31, 31.
Bishop, D. J. & Nair, U. S. (1939). J.R. Statist. Soc. Suppl. 6, 89.
Brown, G. W. (1939). Ann. Math. Statist. 10, 119.
George, A. (1945). Sankhya, 1, 20.
Hartley, H. O. (1940). Biometrika, 31, 249.
Hartley, H. O. & Pearson, E. S. (prefatory note) (1946). Biometrika, 33, 296.
Mauchly, J. W. (1940). Ann. Math. Statist. 11, 204.
Milne-Thomson, L. M. (1933). The calculus of finite differences. Macmillan.
Nair, U. S. (1939). Biometrika, 30, 274.
Nayer, P. P. N. (1936). Statist. Res. Mem. 1, 38.
Neyman, J. & Pearson, E. S. (1928). Biometrika, 20 A, 175 and 263.
Neyman, J. & Pearson, E. S. (1931). Bull. int. Acad. Cracovie, A, p. 460.
Neyman, J. & Pearson, E. S. (1936). Statist. Res. Mem. 1, 1.
Neyman, J. & Pearson, E. S. (1938). Statist. Res. Mem. 2, 25.
Pearson, E. S. & Wilks, S. S. (1933). Biometrika, 25, 353.
Pitman, E. J. G. (1939). Biometrika, 31, 206.
Plackett, R. L. (1946). J.R. Statist. Soc. 109, 457.
Plackett, R. L. (1947). Biometrika, 34, 311.
Rao, C. R. (1948). Biometrika, 35, 71.
Thompson, C. M. (1941). Biometrika, 32, 151.
Thompson, C. M. & Merrington, M. (1946). Biometrika, 33, 296.
Tukey, J. W. & Wilks, S. S. (1946). Ann. Math. Statist. 17, 318.
Votaw, D. F. (1948). Ann. Math. Statist. 19, 447.
Wald, A. & Brookner, R. J. (1941). Ann. Math. Statist. 12, 137.
Welch, B. L. (1935). Biometrika, 27, 145.
Welch, B. L. (1936). Statist. Res. Mem. 1, 52.
Wilks, S. S. (1932). Biometrika, 24, 471.
Wilks, S. S. (1935). Econometrica, 3, 309.
Wilks, S. S. (1946). Ann. Math. Statist. 17, 257.
Wishart, J. (1928). Biometrika, 20 A, 32.




NOTE ON APPROXIMATIONS TO THE POWER FUNCTION OF THE
‘2 x 2 COMPARATIVE TRIAL’


By G. P. SILLITTO 
Research Department, Imperial Chemical Industries, Ltd., Nobel Division 


The power function of the 2 x 2 table arising in what Barnard (1947) has called the ‘2 x 2 
comparative trial’ and Pearson (1947) has designated ‘Problem II’ has been discussed by
P. B. Patnaik (1948) in this journal. In his investigation he made approximations of the type 
which involve representing binomial or hypergeometric distributions by normal distribu- 
tions, and in certain cases he examined the adequacy of the approximations numerically, 
by comparing values of the approximate power function he derived, with values calculated 
exactly. 

There is some interest in comparing Patnaik’s approximation with that obtained by using 
the angular transformation for a binomial variate. If P is the probability that an individual 
will possess a given character and f = r/n is the proportion or relative frequency of individuals 
with this character observed in a random sample of n, then it is known that 


$$x = \arcsin\sqrt{f}, \qquad (1)$$

where the angle is measured in radians, is distributed approximately normally about a mean
of $\arcsin\sqrt{P}$ with a standard deviation of $\tfrac12 n^{-1/2}$. The problem of comparing two observed
values of f thus becomes equivalent to that of comparing variables from normal populations
with known standard deviations.

The transformation (1) has of course been known for a long time. It has recently been 
discussed by Eisenhart (1947), who gives a bibliography, and Paulson & Wallis (1947) have 
indicated its application in the planning and analysing of experiments for comparing two 
percentages. Bartlett (1937) has given a table of the transformation, the angle being in 
radians, while Fisher & Yates (1938) have tabulated it, showing the angle in degrees. Since 
the reason for using a transformation is essentially one of convenience and since the use of 
the radian as the unit leads to a slightly simpler expression for the variance, there are 
advantages in employing the radian. For the case of two independent samples, if the
observed relative frequencies $f_1$ and $f_2$ are based respectively on m and n observations, then
the difference between their respective transformations $x_1$ and $x_2$ will be approximately
normally distributed with known standard error,

$$\sigma = \tfrac12\sqrt{\Big(\frac{1}{m} + \frac{1}{n}\Big)}. \qquad (2)$$

Thus $(x_1 - x_2)/\sigma$ will vary about a mean

$$\mu = (\arcsin\sqrt{P_1} - \arcsin\sqrt{P_2})/\sigma, \qquad (3)$$

with unit standard deviation. If the null hypothesis is true, $P_1 = P_2$ and $\mu = 0$. Defining
$u_\alpha$ as follows,

$$\alpha = \frac{1}{\sqrt{(2\pi)}}\int_{u_\alpha}^{\infty} e^{-\frac12 t^2}\,dt, \qquad (4)$$











the chance of establishing significance at the 100$\alpha$ % level, when $P_1 \ne P_2$, i.e. the power of
the test, will be approximated as follows:

(a) For one-sided test

$$\text{Power} = \frac{1}{\sqrt{(2\pi)}}\int_{u_\alpha-\mu}^{\infty} e^{-\frac12 t^2}\,dt. \qquad (5)$$

(b) For two-sided test

$$\text{Power} = \frac{1}{\sqrt{(2\pi)}}\int_{-\infty}^{-u_{\frac12\alpha}-\mu} e^{-\frac12 t^2}\,dt + \frac{1}{\sqrt{(2\pi)}}\int_{u_{\frac12\alpha}-\mu}^{\infty} e^{-\frac12 t^2}\,dt. \qquad (6)$$

The position is illustrated in Fig. 1. For the one-sided test the power is the area under the
curve centred at $\mu$ lying to the right of the ordinate at $u_\alpha$. For the two-sided test it is the sum
of the areas under this curve lying to the left of the ordinate at $-u_{\frac12\alpha}$ and to the right of the
ordinate at $u_{\frac12\alpha}$.
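Equations (2), (3), (5) and (6) translate directly into a short computation. The following Python sketch is one possible rendering, using `NormalDist` from the standard library for the normal integral; the function name `power_angular` and its argument names are hypothetical.

```python
from math import asin, sqrt
from statistics import NormalDist


def power_angular(p1, p2, m, n, alpha, two_sided=True):
    """Approximate power of the 2 x 2 comparative trial via the angular transformation."""
    sigma = 0.5 * sqrt(1 / m + 1 / n)                  # equation (2)
    mu = (asin(sqrt(p1)) - asin(sqrt(p2))) / sigma     # equation (3), angles in radians
    z = NormalDist()
    if two_sided:
        u = z.inv_cdf(1 - alpha / 2)                   # u for alpha/2
        return z.cdf(-u - mu) + 1 - z.cdf(u - mu)      # equation (6)
    u = z.inv_cdf(1 - alpha)                           # u for alpha
    return 1 - z.cdf(u - mu)                           # equation (5)


# m = 18, n = 12, two-sided test at the 10% level.
print(round(power_angular(0.5, 0.9, 18, 12, 0.10), 3))
```

When $P_1 = P_2$ the function returns $\alpha$ itself, as the theory requires.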


[Fig. 1. Sampling distributions of $(x_1 - x_2)/\sigma = \{\arcsin\sqrt{f_1} - \arcsin\sqrt{f_2}\}\big/\tfrac12\sqrt{(1/m + 1/n)}$: the distribution on the null hypothesis, $P_1 = P_2$, centred at 0, and the distribution when $P_1 > P_2$, centred at $\mu$.]

In order to obtain an indication of the usefulness of the approximation to the power of 
the 2 x 2 comparative trial which is afforded by using the transformation (1) and the normal 
distribution theory just outlined, values for the power of the test have been calculated for 
the cases for which Patnaik evaluated the true power of the 2 x 2 table. The results are given 
in Tables 1 (a), (b), 2(a) and (b) below. The first two tables correspond to Patnaik’s Tables 
2 (a) and (b), and the bracketed figures are the values for the power as calculated from the 
approximate theory given above, while the other figures are the values for the true power, 
given by Patnaik. Similarly, in Tables 2 (a) and (b), the exact values are quoted from
Tables 3 (a) and (b) in Patnaik’s paper, and the corresponding approximate values from the 
present method are shown for comparison. 

It is evident that the approximation is quite good, over the range covered by the present
tables. It appears, in fact, to be slightly better on the whole than Patnaik’s, a detailed comparison
of the figures with those in his paper giving the results shown in Table 3. Omitting
the $P_1 = P_2$ diagonals of Tables 1 (a) and (b), there are 172 cases in which numerical comparison
is possible.

It may be mentioned also that in the region where the true power is 0-9 or greater, which is 
probably the most important in practice, Patnaik’s approximations generally tend to over- 
estimate the power, whilst the present method tends to under-estimate it. Some indication 
of the practical importance of the difference between the approximations to the power 
function can be obtained in cases where the true power is known, by using the approximations 
to provide estimates of n, the sample size required in order to afford a test of assigned power, 






































































Table 1 (a). Power function for m = 18, n = 12.
Significance level α = 0.10 in the two-sided test
Approximate values derived through the angular transformation are shown in parentheses.

    P2 \ P1    0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9

    0.9       1.000    0.997    0.975    0.907    0.773    0.574    0.356    0.170    0.088
             (1.000)  (0.995)  (0.974)  (0.917)  (0.800)  (0.619)  (0.398)  (0.197)  (0.100)
    0.8       0.997    0.972    0.882    0.732    0.494    0.289    0.144    0.094    0.202
             (0.995)  (0.965)  (0.882)  (0.733)  (0.533)  (0.326)  (0.165)  (0.100)
    0.7       0.977    0.880    0.722    0.487    0.272    0.136    0.092    0.159    0.393
             (0.974)  (0.882)  (0.714)  (0.501)  (0.298)  (0.153)  (0.100)
    0.6       0.918    0.742    0.515    0.287    0.142    0.099    0.155    0.320    0.610
             (0.917)  (0.733)  (0.501)  (0.290)  (0.159)  (0.100)
    0.5       0.796    0.534    0.309    0.156    0.100    0.156    0.309    0.534    0.796
             (0.800)  (0.533)  (0.298)  (0.159)  (0.100)
    0.4       0.610    0.320    0.155    0.099    0.142    0.287    0.515    0.742    0.918
             (0.619)  (0.326)  (0.153)  (0.100)
    0.3       0.393    0.159    0.092    0.136    0.272    0.487    0.722    0.880    0.977
             (0.398)  (0.165)  (0.100)
    0.2       0.202    0.094    0.144    0.289    0.494    0.732    0.882    0.972    0.997
             (0.197)  (0.100)
    0.1       0.088    0.170    0.356    0.574    0.773    0.907    0.975    0.997    1.000
             (0.100)
Table 1 (b). Power function for m = 18, n = 12.
Significance level α = 0.02 in the two-sided test
Approximate values derived through the angular transformation are shown in parentheses.

    P2 \ P1    0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9

    0.9       0.998    0.976    0.902    0.752    0.532    0.290    0.103    0.017    0.006
             (0.996)  (0.971)  (0.897)  (0.759)  (0.564)  (0.353)  (0.173)  (0.060)  (0.020)
    0.8       0.961    0.882    0.694    0.490    0.265    0.115    0.035    0.016    0.055
             (0.971)  (0.870)  (0.693)  (0.476)  (0.275)  (0.127)  (0.046)  (0.020)
    0.7       0.909    0.714    0.460    0.248    0.111    0.041    0.021    0.050    0.177
             (0.897)  (0.693)  (0.453)  (0.248)  (0.111)  (0.041)  (0.020)
    0.6       0.767    0.500    0.262    0.111    0.041    0.023    0.048    0.138    0.361
             (0.759)  (0.476)  (0.248)  (0.107)  (0.039)  (0.020)
    0.5       0.625    0.293    0.124    0.045    0.022    0.045    0.124    0.293    0.625
             (0.564)  (0.275)  (0.111)  (0.039)  (0.020)
    0.4       0.361    0.138    0.048    0.023    0.041    0.111    0.262    0.500    0.767
             (0.353)  (0.127)  (0.041)  (0.020)
    0.3       0.177    0.050    0.021    0.041    0.111    0.248    0.460    0.714    0.909
             (0.173)  (0.046)  (0.020)
    0.2       0.055    0.016    0.035    0.115    0.265    0.490    0.694    0.882    0.961
             (0.060)  (0.020)
    0.1       0.006    0.017    0.103    0.290    0.532    0.752    0.902    0.976    0.998
             (0.020)















































350 Approximations to the power function of the ‘2 x 2 comparative trial’ 


Table 2(a). Some values of the power function for the two-sided test for m = n = 15

                       alpha = 0.10                    alpha = 0.02
 P1      P2      Exact     Approximate value     Exact     Approximate value
                 value     from angular          value     from angular
                           transformation                  transformation
 0.3     0.4     0.141         0.156             0.034         0.042
 0.6     0.8     0.306         0.334             0.112         0.133
 0.1     0.3     0.389         0.409             0.149         0.180
 0.2     0.7     0.896         0.893             0.680         0.713
 0.05    0.5     0.916         0.922             0.736         0.770
 0.1     0.6     0.919         0.926             0.739         0.778
 0.2     0.8     0.974         0.970             0.872         0.885
 0.1     0.7     0.980         0.978             0.894         0.910









Table 2(b). Some values of the power function for the two-sided test for m = n = 30

                       alpha = 0.10                    alpha = 0.02
 P1      P2      Exact     Approximate value     Exact     Approximate value
                 value     from angular          value     from angular
                           transformation                  transformation
 0.05    0.3     0.884         0.864             0.631         0.662
 0.1     0.4     0.885         0.878             0.691         0.686
 0.3     0.7     0.937         0.939             0.807         0.805
 0.2     0.6     0.945         0.948             0.839         0.828
 0.1     0.5     0.977         0.974             0.902         0.897
 0.2     0.7     0.993         0.993             0.965         0.961


Table 3. Comparison of approximations to the power function of the 2 x 2 comparative trial

 Table no.    No. of cases in which     No. of indecisive cases     No. of cases in which
              Patnaik's first           (equally discrepant, or     the angular
              approximation is          indeterminate owing to      transformation is
              nearer to the             rounding-up)                nearer to the
              exact value                                           exact value
 1(a)              6                         28                          38
 1(b)              6                         36                          30
 2(a)              0 (1)                      3 (3)                      13 (12)
 2(b)              0 (0)                      0 (1)                      12 (11)
 Totals           12                         67                          93

Bracketed figures in the body of this table refer to Patnaik's second approximation.













$\beta$, when a particular significance level, $\alpha$, has been chosen. In his Table 7 Patnaik has
estimated n for selected cases of this kind using the two-sided test and $\alpha = 0.10$ and 0.02. Table 4
contains his results together with those obtained by using the angular transformation
approximation. In order to obtain a simple expression for n, using this approximation, the
left-hand term in expression (6) may be neglected, since it makes an appreciable contribution
to the power only when the power is rather low, unless $\alpha$ is larger than is usual in significance
testing. Making the relation between $\beta$ and $u_\beta$, as for $\alpha$ and $u_\alpha$, defined in equation (4), it
follows that for the two-sided test

\[ \mu = u_\alpha - u_\beta. \tag{7} \]


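Hence, using equations (2) and (3), it is seen that

\[ \frac{1}{m} + \frac{1}{n} = 4(\arcsin\sqrt{P_1} - \arcsin\sqrt{P_2})^2 / (u_\alpha - u_\beta)^2. \tag{8} \]

In the particular case when m = n, therefore, n is given by

\[ n = \tfrac{1}{2}(u_\alpha - u_\beta)^2 / (\arcsin\sqrt{P_1} - \arcsin\sqrt{P_2})^2. \tag{9} \]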


This formula has been given by Paulson & Wallis (1947) with reference to the single-sided
test, and they have provided a nomogram for determining sample sizes in the one-sided case,
which can, of course, be used for a two-sided experiment if their $\alpha$ is taken as one-half the
risk of the error of the first kind which can be tolerated. Using (9) on the cases of Patnaik's
Table 7, the values in Table 4 are obtained.
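
As a numerical illustration of formula (9), the following minimal Python sketch evaluates n for three of the cases of Table 4. The function name `sample_size_eq9` is hypothetical, and two readings are assumed rather than taken from the text: $u_\alpha$ is treated as the upper $\alpha/2$ point of the unit normal distribution (two-sided test), and $u_\beta$ as the deviate exceeded with probability equal to the required power, so that it is negative when the power exceeds one-half. Angles are in radians, and the result is rounded up to a whole observation.

```python
import math
from statistics import NormalDist


def sample_size_eq9(p1, p2, alpha, power):
    """n = (1/2) (u_alpha - u_beta)^2 / (arcsin sqrt(p1) - arcsin sqrt(p2))^2."""
    std_normal = NormalDist()
    # Assumption: u_alpha is the upper alpha/2 point (two-sided test).
    u_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Assumption: u_beta is the deviate exceeded with probability `power`.
    u_beta = -std_normal.inv_cdf(power)
    # Difference of the angular transforms, in radians.
    delta = math.asin(math.sqrt(p1)) - math.asin(math.sqrt(p2))
    return 0.5 * (u_alpha - u_beta) ** 2 / delta ** 2


# Three cases with true sample size 15, rounded up to the next integer.
n_first = math.ceil(sample_size_eq9(0.05, 0.5, alpha=0.10, power=0.916))
n_second = math.ceil(sample_size_eq9(0.05, 0.5, alpha=0.02, power=0.736))
n_third = math.ceil(sample_size_eq9(0.1, 0.7, alpha=0.10, power=0.980))
print(n_first, n_second, n_third)
```

Under these assumptions the three estimates can be compared directly with the last two columns of Table 4.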


Table 4. Estimation of the sample size from Patnaik's first approximation
and from the angular transformation approximation

 True                          True power*            Patnaik's              Estimate of n
 sample    P1      P2                                 estimate of n          from equation (9)
 size                       For        For          For        For         For        For
                          a = 0.10   a = 0.02     a = 0.10   a = 0.02    a = 0.10   a = 0.02
  15      0.05    0.5      0.916      0.736          11         11          15         14
  15      0.1     0.6      0.919      0.739          13         12          15         14
  15      0.1     0.7      0.980      0.894          13         12          16         15
  30      0.05    0.3      0.884      0.631          26         23          33         29
  30      0.1     0.5      0.977      0.902          27         26          31         31
  30      0.2     0.7      0.993      0.965          28         28          31         31

* The power of the two-sided test. (a denotes the significance level $\alpha$.)


The angular transformation (1) is, of course, a particular case of a type of transformation
recently considered by Anscombe (1948), and it is not difficult to show, using his methods,
that for finite numbers of observations and $P \neq 0.5$, the expectation of x is not exactly
$\arcsin\sqrt{P}$, but differs from it by a function of n and P, which increases as n decreases and
as $|P - 0.5|$ increases. The higher moments of x exhibit rather similar departures from those
of a normal variate. It is to be expected, therefore, that the angular transformation
approximation will be unlikely to remain satisfactory for smaller n and extreme frequencies, but on
the present evidence it appears to be satisfactory for values of n and P which are of interest
in many practical problems.


Grateful acknowledgement is made to Prof. E. S. Pearson for his interest and for helpful 
suggestions as to the framing of this note. 


REFERENCES 


ANSCOMBE, F. J. (1948). The transformation of Poisson, binomial, and negative-binomial data.
Biometrika, 35, 246.

BARNARD, G. A. (1947). Significance tests for 2 x 2 tables. Biometrika, 34, 123.

BARTLETT, M. S. (1937). Sub-sampling for attributes. J.R. Statist. Soc. Suppl. 4, 131.

EISENHART, C. (1947). Inverse Sine Transformation of Proportions. Chapter 16 of Selected Techniques
of Statistical Analysis by the Statistical Research Group, Columbia University; edited by Eisenhart,
Hastay and Wallis. New York: McGraw-Hill Book Co. Inc.

FISHER, R. A. & YATES, F. (1938). Statistical Tables for Biological, Agricultural and Medical Research,
3rd ed. 1948, p. 56. Edinburgh: Oliver and Boyd.

PATNAIK, P. B. (1948). The power function of the test for the difference between two proportions in a
2 x 2 table. Biometrika, 35, 157.

PAULSON, E. & WALLIS, W. A. (1947). Planning and Analyzing Experiments for comparing Two
Percentages. Chapter 7 of Selected Techniques of Statistical Analysis by the Statistical Research Group,
Columbia University; edited by Eisenhart, Hastay and Wallis. New York: McGraw-Hill Book
Co. Inc.

PEARSON, E. S. (1947). The choice of statistical tests illustrated on the interpretation of data classed
in a 2 x 2 table. Biometrika, 34, 139.










THE DISTRIBUTION OF 'STUDENT'S' t IN RANDOM SAMPLES OF
ANY SIZE DRAWN FROM NON-NORMAL UNIVERSES


By A. K. GAYEN, St Catharine's College, University of Cambridge


1. INTRODUCTION


The effect of universal non-normality on 'Student's' t has been studied so far either by way
of particular numerical examples or by approximations to its sampling distribution based
on the large sample assumption. E. S. Pearson (1928, 1929) has shown in his experimental
investigation that the effect of universal 'excess' and of 'skewness' on 'Student's' ratio z
(which is related to t by $t = z\sqrt{(n-1)}$) may be considerable. He has also furnished a small
table, based on experimental results, showing some actual values of the probability integral
for the ratio in samples of 2, 5, 10 and 20 from a few universes with specified values of
'skewness' and 'excess'. M. S. Bartlett (1935) obtained some theoretical results for approximately
representing the distribution of t, taking into account the universal 'skewness' and 'excess'.
The approximation is based on the assumption of large samples and is not very satisfactory,
for in that approach, as he himself has also pointed out, the term 'approximation' is perhaps
misleading. Still from the form of the expression obtained, it has been observed that the
effect of the 'skewness' $\lambda_3$ in the original distribution is the more serious and that of the
'excess' $\lambda_4$ is small. This, of course, confirms Pearson's experimental results, but the resulting
expression clearly cannot furnish a quantitative measure of the corrections to be applied
to normal theory probabilities of t, when the sample size is small. R. C. Geary (1936) obtained
an expression for the distribution of t in samples of any size drawn from a slightly asymmetrical
universe. It consists of two components, one being the 'normal theory' t and the other a
term in $\lambda_3$, which has been called the 'corrective function' due to it. He has also given a table
of corrective tail-area probabilities for some representative values of n. In a recent
communication (1947) he has given from a different approach an approximate formula for the
frequency of t, correct to $n^{-2}$, n being the size of sample. From this and analogous results on
non-normal Fisher's z, he has shown, by a few illustrative examples, that the inferences
drawn from the standard tables may be seriously in error even in some cases where the
parent is not considerably non-normal. He has accordingly pointed out the primary
importance of testing for normality from the available samples and has suggested that when
universal normality cannot be assumed, the standard tables should be corrected by using
the sample estimates of $\lambda_3$ and $\lambda_4$, etc., in conjunction with the theoretical results deduced
therein. But so far as the frequency of t is concerned, satisfactory measures of probabilities
for small samples cannot be obtained from his results, since they are mainly based on large
sample assumptions. Also the first few terms of his result, as far as $n^{-1}$, agree with those of
Bartlett, which, as has been pointed out, are not very satisfactory.

The purpose of the present investigation is to obtain the form of the corrective terms due
to population cumulants $\lambda_3$, $\lambda_4$, etc., in the frequency function of t for any size of sample.
For this the parent population will be specified by the Edgeworth series. It has been proved
rigorously by Cramér (1928) that this series gives a real asymptotic expansion of any universe
f(x), in powers of $\nu_0^{-1/2}$, where $\nu_0$ is the number of sources of 'elementary errors'. Inclusion of
terms of order $\nu_0^{0}, \nu_0^{-1/2}, \nu_0^{-1}, \ldots, \nu_0^{-r/2}$ gives rise to the first, second, third, ..., (r+1)th
approximations to the law of error. We shall consider mainly the third approximation, including terms
in $\lambda_3$, $\lambda_4$ and $\lambda_3^2$, but shall also deal with the fourth approximation which takes in $\lambda_5$, $\lambda_3\lambda_4$
and $\lambda_3^3$.

Probabilities of values of t, obtained from the derived formula, for the parent specified by
the third approximation, are in satisfactory agreement with the experimental determinations
made so far, even in cases where the sampled populations are represented by the
exponential curve or a very skew Type III curve of Pearson. Accordingly, it appears that
the derived expressions for 'Student's' t may perhaps provide quite satisfactory estimates
of the probabilities, for a fairly wide class of non-normal universes, especially when the
sample size is not too small.


The values of the cumulants of the population have been assumed to be known, and the
question of their estimation has not been considered.


2. JOINT DISTRIBUTION OF THE SUM AND THE SUM OF SQUARES
OF DEVIATIONS OF n SAMPLE OBSERVATIONS


We shall first of all obtain the distribution of 'Student's' t, with particular reference to the
parent population being specified by the third approximation to the law of error. This
will give us the corrective terms due to $\lambda_3$, $\lambda_4$ and $\lambda_3^2$. It will be shown afterwards that
similar corrective terms due to $\lambda_5$, $\lambda_3\lambda_4$ and $\lambda_3^3$ can be obtained by the same methods when
the parent population is represented by the Edgeworth series up to the corresponding
terms.


Let us then consider the parent population to be specified by the third approximation,
correct to $\nu_0^{-1}$,

\[ f(x) = \phi(x) - \frac{\lambda_3}{3!}\,\phi^{(3)}(x) + \frac{\lambda_4}{4!}\,\phi^{(4)}(x) + \frac{10\lambda_3^2}{6!}\,\phi^{(6)}(x), \tag{2.1} \]

where $\lambda_3$ and $\lambda_4$ are the measures of universal skewness and excess,

\[ \phi(x) = \frac{1}{\sqrt{(2\pi)}}\,e^{-\frac{1}{2}x^2}, \tag{2.2} \]

the normal function in standardized form, and

\[ \phi^{(\nu)}(x) = \left(\frac{d}{dx}\right)^{\nu}\phi(x) = (-1)^{\nu} H_\nu(x)\,\phi(x), \tag{2.3} \]

where $H_\nu(x)$ is the well-known Hermitian polynomial of degree $\nu$.
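
The third approximation (2.1) can be checked numerically. The sketch below is an illustration only: the function names and the trial values $\lambda_3 = 0.5$, $\lambda_4 = 0.8$ are assumptions, the Hermitian polynomials are built from the standard recurrence $H_{k+1}(x) = xH_k(x) - kH_{k-1}(x)$, and the integrals are taken by Simpson's rule. It verifies that f(x) has unit area and that its third and fourth moments about the origin are $\lambda_3$ and $3 + \lambda_4$.

```python
import math


def hermite(nu, x):
    """Hermitian polynomial H_nu(x) from H_{k+1} = x H_k - k H_{k-1}."""
    h_prev, h = 1.0, x
    if nu == 0:
        return h_prev
    for k in range(1, nu):
        h_prev, h = h, x * h - k * h_prev
    return h


def phi(x):
    """Normal function in standardized form, equation (2.2)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def edgeworth_third(x, lam3, lam4):
    """Equation (2.1), using phi^(nu) = (-1)^nu H_nu phi from (2.3)."""
    return phi(x) * (1.0
                     + lam3 / 6.0 * hermite(3, x)
                     + lam4 / 24.0 * hermite(4, x)
                     + 10.0 * lam3 ** 2 / 720.0 * hermite(6, x))


def simpson(f, lo, hi, steps=4000):
    """Composite Simpson rule with an even number of steps."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


lam3, lam4 = 0.5, 0.8  # illustrative values only
total = simpson(lambda x: edgeworth_third(x, lam3, lam4), -12.0, 12.0)
third_moment = simpson(lambda x: x ** 3 * edgeworth_third(x, lam3, lam4), -12.0, 12.0)
fourth_moment = simpson(lambda x: x ** 4 * edgeworth_third(x, lam3, lam4), -12.0, 12.0)
print(total, third_moment, fourth_moment)
```

The term in $\lambda_3^2$ leaves the moments up to the fourth unchanged, since $H_6$ is orthogonal to every polynomial of lower degree.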


Let $x_1, x_2, \ldots, x_n$ be n independent sample observations drawn at random from the universe
(2.1), and let us adopt the notations

\[ S_1 = \sum_{i=1}^{n} x_i, \qquad S_2 = \sum_{i=1}^{n} x_i^2, \tag{2.4} \]

and

\[ S_2' = \sum_{i=1}^{n} (x_i - \bar{x})^2 = S_2 - \frac{S_1^2}{n} = (n-1)s^2, \tag{2.5} \]

where $s^2$ is the estimated variance.







By the use of Bartlett's method (1935) the joint frequency density of $(S_1, S_2)$ (which is
also the joint frequency density of $(S_1, S_2')$, since the Jacobian of transformation is unity)
may be written in the form


9,(S,,S;) = W(n—-1) — Ss {D3W (n+ 5)—6D,D,W(n+3)} 


+a {D4 W (n+ 7) —12D3D,W (n+ 5) + 12D3W (n+ 3)} 


2 
+ = (nD8W (n+ 11)—6(2n +3) Di D,W (n+ 9) + 36(n +4) D3 D3W (n+ 7) — 120D3W (n+ 5)}, 


2-6) 
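when terms in $\lambda$ other than in $\lambda_3$, $\lambda_4$ and $\lambda_3^2$ are neglected. In (2.6)

\[ W(n_j) = \frac{e^{-S_1^2/2n}}{\sqrt{(2\pi n)}} \cdot \frac{S_2'^{\,\frac{1}{2}(n_j-2)}\,e^{-\frac{1}{2}S_2'}}{2^{\frac{1}{2}n_j}\,\Gamma(\tfrac{1}{2}n_j)} = \frac{e^{-\frac{1}{2}S_2}\,(S_2 - S_1^2/n)^{\frac{1}{2}(n_j-2)}}{\sqrt{(2\pi n)}\;2^{\frac{1}{2}n_j}\,\Gamma(\tfrac{1}{2}n_j)}, \tag{2.7} \]

and

\[ D_1 = \frac{\partial}{\partial S_1}, \qquad D_2 = \frac{\partial}{\partial S_2}. \tag{2.8} \]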


We now proceed to evaluate the derivatives involved in the expression (2.6). It is possible
to transform the product operators of the form $D_1^{\nu_1} D_2^{\nu_2}$ (where $\nu_1, \nu_2 = 0, 1, 2, \ldots$) into functions
of the single operator $D_1$.

For any typical function $W(n_j)$, defined in (2.7), where $n_j$ stands for (n-1), (n+1),
(n+3), ..., it can be shown that


Dn DEW nt) = | —— | Deas) . (":) Dp W (ni —2)+ (3) Dp W(ni—4)—... 
+(-1) (’ *) Dew (my — 2r)+...4+(—1): DPW (ng — 24) |. (2-9) 
By differentiating $W(n_j)$ successively with respect to $S_1$ we get
waar S, 
D,W(n}) = (—1) W(ni- — 


D?W (nj) = (—1)2 W(n 


4) {(8 

On 
YW (- o— 6 a =) 
DW (ng) = (—1)8 W( <-0f@) ar "> )() 


sWia'\ « (.. a | S; 6 Sy 2 S. be 3 =)' 
sialic te (2) ~ wml Gh acaacate i 


and so on.
It will be observed that the numerical coefficients are those of the Hermitian polynomials.



So we can write, in general, 


Dy¥ (ng) = (— 1) W(ni — 20) (2) -5 deo =)" (=) 


2.1!(n4—2v) n 


—-he- 2) (v—3) S.\"-4 /S,\2 
Fa See) (x) ~~] eo 








For the present case as we have specified the population by the third approximation to the
law of error we require the values of $D_1^{\nu}$, for $\nu = 0, 1, 2, \ldots$ up to 6.


Now using results (2.9) and (2.10) in (2.6), we obtain, after some simplification, the
following expression for the joint frequency function of $S_1$ and $S_2'$:


9x(S,, 8) = W(n—1) [: +H (>) + 3( =) #1 =) 
Ho) of9) nf) HEN] ALA] noe) 


saeco) Joo LEY -20 905) Fatal 


| min +1)(2)'-30m—-1) ] +67 Heel 5 (= “) \). (2-11) 


where $H_\nu(S_1/n)$ is the Hermitian polynomial of degree $\nu$ in $(S_1/n)$, and $W(n-1)$ is given by
(2.7). As a check it was found that, to the required approximation, $\int_0^{\infty} g_1(S_1, S_2')\,dS_2'$ gave the
standard formula for the distribution of the sum of n sample observations.


3. THE FREQUENCY FUNCTION OF ESTIMATED VARIANCE 


Integrating (2.11) for $S_1$ between the limits $-\infty$ and $\infty$, we find, for the frequency of $S_2'$,
after some simplification,


ml = V(S — Ay aa of_ S} ¢ S» | 
91(*,8,) = V(Sq,(n hailia Warne aat 
hn “s 2) _ Si 383 38, _ || 3. 
¥ fan ~~ 2) Sams mol) wal) (na) maa eC (31) 


where $V(S_2', (n-1))$ is the frequency density of normal $S_2'$. For the distribution of $s^2 = S_2'/(n-1)$,
this becomes

\[ g_1(*, (n-1)s^2)\,(n-1)\,d(s^2). \tag{3.2} \]


Note that if we introduce the operator D,. = —._———.,, the frequency function (3-1) can 
a[S./(n—1)} 
be put in the form 
As ns 9) 2 y9 (n—2) @.s 
Vin 1) + 57 Dia V(n + 3)—-3, 48 Stn 2 1 Pee 2V(n+5), (3-3) 


where S, has been omitted from the V’s. 





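4. THE DISTRIBUTION OF 'STUDENT'S' t

Put $S_1 = \{nS_2'/(n-1)\}^{\frac{1}{2}}\,t$ in (2.11) and integrate for $S_2'$ from 0 to $\infty$, when we find that a typical term
$W(n-1)\,S_1^{r_1} S_2'^{\,r_2}\,dS_1\,dS_2'$ becomes

\[ \frac{n^{\frac{1}{2}r_1}\,2^{\frac{1}{2}(r_1+2r_2)}\,\Gamma[\tfrac{1}{2}(n+r_1+2r_2)]}{\sqrt{\pi}\;\Gamma[\tfrac{1}{2}(n-1)]\,(n-1)^{\frac{1}{2}(r_1+1)}} \cdot \frac{t^{r_1}\,dt}{[1+t^2/(n-1)]^{\frac{1}{2}(n+r_1+2r_2)}}. \tag{4.1} \]

It will be noticed that (4.1) gives the normal theory t for $r_1 = r_2 = 0$. We shall denote the
distribution of t by p(t). Thus we get by using (4.1) and after simplification

\[ p(t) = \frac{\Gamma(\tfrac{1}{2}n)}{\sqrt{[\pi(n-1)]}\;\Gamma[\tfrac{1}{2}(n-1)]} \cdot \frac{1}{[1+t^2/(n-1)]^{\frac{1}{2}n}}
 + \lambda_3\,\frac{3(n-1)t - (2n-1)t^3}{6(n-1)\sqrt{(2n\pi)}\;[1+t^2/(n-1)]^{\frac{1}{2}(n+3)}} \]
\[ \quad - \lambda_4\,\frac{\Gamma[\tfrac{1}{2}(n+2)]}{24n\sqrt{[\pi(n-1)]}\;\Gamma[\tfrac{1}{2}(n+3)]} \cdot \frac{3(n-1) - 6(n+1)t^2 + (n+1)t^4}{[1+t^2/(n-1)]^{\frac{1}{2}(n+4)}} \]
\[ \quad + \lambda_3^2\,\frac{\Gamma[\tfrac{1}{2}(n+2)]}{144n(n-1)^{\frac{3}{2}}\sqrt{\pi}\;\Gamma[\tfrac{1}{2}(n+5)]} \cdot \frac{3(n-1)^2(2n+11) - 9(n-1)(n+3)(2n-1)t^2 - 3(n+1)(n+3)(2n+13)t^4 + (n+1)(n+3)(2n+5)t^6}{[1+t^2/(n-1)]^{\frac{1}{2}(n+6)}} \]
\[ \quad = p_0(t) + \lambda_3\,p_{\lambda_3}(t) - \lambda_4\,p_{\lambda_4}(t) + \lambda_3^2\,p_{\lambda_3^2}(t). \tag{4.2} \]

Obviously $p_0(t)$ is the normal theory t for a sample of size n, and $p_{\lambda_3}(t)$, $p_{\lambda_4}(t)$ and $p_{\lambda_3^2}(t)$ are
the corrective terms due to universal $\lambda_3$, $\lambda_4$ and $\lambda_3^2$.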
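
The four terms of the frequency function can be evaluated directly. The following Python sketch is an illustration under stated assumptions: the algebraic forms are a transcription of the terms of (4.2) with the exponents $\frac{1}{2}n$, $\frac{1}{2}(n+3)$, $\frac{1}{2}(n+4)$ and $\frac{1}{2}(n+6)$ in the four denominators, and the function names, the sample size n = 10 and the integration range are hypothetical choices. Simpson's rule is used to confirm that $p_0(t)$ has unit area while each corrective term contributes nothing to the total.

```python
import math


def t_terms(t, n):
    """Return (p0, p_lam3, p_lam4, p_lam3sq) at t for a sample of size n."""
    a = n - 1.0                      # degrees of freedom
    u = 1.0 + t * t / a
    p0 = (math.gamma(n / 2) / (math.sqrt(math.pi * a) * math.gamma(a / 2))
          * u ** (-n / 2))
    p3 = ((3 * a * t - (2 * n - 1) * t ** 3)
          / (6 * a * math.sqrt(2 * n * math.pi)) * u ** (-(n + 3) / 2))
    p4 = (math.gamma((n + 2) / 2)
          / (24 * n * math.sqrt(math.pi * a) * math.gamma((n + 3) / 2))
          * (3 * a - 6 * (n + 1) * t ** 2 + (n + 1) * t ** 4)
          * u ** (-(n + 4) / 2))
    poly6 = (3 * a * a * (2 * n + 11)
             - 9 * a * (n + 3) * (2 * n - 1) * t ** 2
             - 3 * (n + 1) * (n + 3) * (2 * n + 13) * t ** 4
             + (n + 1) * (n + 3) * (2 * n + 5) * t ** 6)
    p33 = (math.gamma((n + 2) / 2)
           / (144 * n * a ** 1.5 * math.sqrt(math.pi) * math.gamma((n + 5) / 2))
           * poly6 * u ** (-(n + 6) / 2))
    return p0, p3, p4, p33


def density(t, n, lam3, lam4):
    """p(t) = p0 + lam3 * p_lam3 - lam4 * p_lam4 + lam3^2 * p_lam3sq."""
    p0, p3, p4, p33 = t_terms(t, n)
    return p0 + lam3 * p3 - lam4 * p4 + lam3 ** 2 * p33


def integrate_terms(n, lo=-200.0, hi=200.0, steps=40000):
    """Composite Simpson rule applied to each of the four terms at once."""
    h = (hi - lo) / steps
    totals = [0.0, 0.0, 0.0, 0.0]
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        for k, value in enumerate(t_terms(lo + i * h, n)):
            totals[k] += weight * value
    return [s * h / 3.0 for s in totals]


areas = integrate_terms(n=10)
print(areas)
```

The first area should be close to 1 and the other three close to 0, so that p(t) integrates to unity whatever the values of $\lambda_3$ and $\lambda_4$.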

As a check on the above expression for t, moments may be calculated directly, about
t = 0 and to $n^{-2}$; they are found to be


As 3 


2 2 
y(t) = +7 (1+A§) + 3 (8—Ay) +... 


7A, 15 
3(é) = ——34 1 +— +... 
Halt) ial asd ); 


2 ] 
y(t) = 34+ : (9 — Ag+ 14A3) + . 5 (102 — 830A, + 1203) +... 
e 


(4-3) 





These are in agreement with the asymptotic formulae for the moments of t for any universe
as obtained by Geary (1936), up to first power of $\lambda_4$ and second power of $\lambda_3$. All these
corrective terms when integrated between the limits $-\infty$ and $\infty$ contribute nothing towards
the integral, as may be easily deduced from the fact that $\int_{-\infty}^{\infty} p(t)\,dt = 1$.


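5. THE DISTRIBUTION OF 'STUDENT'S' t FOR THE PARENT SPECIFIED
BY THE FOURTH APPROXIMATION TO THE LAW OF ERROR


Let us specify the parent universe by the fourth approximation to the law of error, correct
up to $\nu_0^{-3/2}$, given by

\[ f(x) = \phi(x) - \frac{\lambda_3}{3!}\,\phi^{(3)}(x) + \frac{\lambda_4}{4!}\,\phi^{(4)}(x) + \frac{10\lambda_3^2}{6!}\,\phi^{(6)}(x) - \frac{\lambda_5}{5!}\,\phi^{(5)}(x) - \frac{35\lambda_3\lambda_4}{7!}\,\phi^{(7)}(x) - \frac{280\lambda_3^3}{9!}\,\phi^{(9)}(x). \tag{5.1} \]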








It is of interest to know the distribution of $s^2$ and 'Student's' t in random samples from this
universe. We can arrive at these results by following a similar procedure, and the expressions
corresponding to (2.6), (2.11) and (4.2) will have additional terms involving $\lambda_5$, $\lambda_3\lambda_4$ and $\lambda_3^3$.
We shall omit these intermediate steps and give below only the extra terms in the joint
distribution of $S_1$ and $S_2'$ and in the distribution of t. Thus corresponding to (2.11), the joint
distribution $g_2(S_1, S_2')$ of $S_1$ and $S_2'$ will be given by


galS1%o) =9y(Sqs Se) + W(n— [Ss {7(= t)+ 10(=2 ) 4,(=2) $15 an ; (=) #(-)| 
+Mians {{ »(=1)'+ 3(3n + 4) ("; ) +21(n+4) (=) +3(3n+ 32) (=! ‘| 


* a 1) (3 )[ scan 1) (3 1)’ + 6(7n? + 43n + 20) (=)'+9(an+ 39n +28) (=)| 


: Se 8; s 
2+ 58 9(3n3- 2 99) |= 
+ospecnls 2)’ [ snctne 5 n+75)(—) + 9(3n3 + 46n? + T1n 4 20) (=) | 








aes, Ses St ‘Se P 3 ° S, \ 
(n +3) (n+ 1) (n— ES, | (an + 53n +129 +98) (=) | 
m8 (F49(%)" ania a(S)! e2700 8. 
+7 male (= +9n(n+3)(=") +27(n-+ 1) (n+6)(=*) 
‘ ¢ 8, . or S, 
+ 9(n + 8) (3n + 32) =) + 135(n+6)(= 
_9 (S83 2 a ee re eo, (9:\° 
“aa )/ n+7)(2) + 3n(2n +21n+33) (=) 


‘S,\3 J 
+ 3(3n? + 51n? + 166n + 60) ey, + 3(23n? + 161n + 96) (=) | 
2 n 


27 


+ -———- (*:)' n*(n* + L4n +41) (=) + n(n? + 61n? +2710 +225) “1)' 
(n+1)(n—1)\n \n ; Nn, 


s 
+ 3(13n3 + 106n? + 131n + 30) (=) 
n 
9n S,\3 /S.\38 
a re 3n3 + 71m? + 411 + 635) {— 
EIT Twa Ty (an) LMA Tat atte + 035)() 
+ 3(20n3 + 273n2 + 533n + 285) (3) | 
n 
+7 <A 4n® + 43n? + 1182 + 115 8 5-2 
\ fen &. BV 1) uy . orZ 
(n+ 5) (n+3)(n+1 aan ae +o n+ 115)(— (5-2) 


Since the expressions appearing with $\lambda_5$, $\lambda_3\lambda_4$ and $\lambda_3^3$ in (5.2) are odd functions in $S_1$, the
integral of $g_2(S_1, S_2')$ with respect to $S_1$ will be independent of these cumulants, so that the
distribution of $s^2$ will be the same as that obtained in (3.2) and (3.3).







For the distribution of t we proceed as before using (4-1), and obtain after simplification 
(p(t) being given by (4-2)) 
i {{n2 + (n — 1)?] 8+ 10(n— 1) B— 15(n— 1)? 
a(t) = ptt) +As 40[,/(2n77)] m(m — 1)? [1 + B/(m — 1) 
| (40n* + 258n* + 182n + 51) t? — 3(150n* + 896n? + 869n + 297) & 
eS Q + 15(n— 1) (68n2 + 31 1n + 237) & — 15(n— 1)? (25n + 59)t 
_ 144[,/(27n)] n(n — 1)8 [1 + 2/(n — 1) e+ 
| (64n4 + 952n3 + 3578n? + 301 1n + 996) & — 9( 144n4 + 191803 + 701 set 
+ 7550n + 2913) t? + 27(m — 1) (252n3 + 2821n* + 8554n + 7077) & 











3 | = 45(n—1 )? (54 ln? + 2086n + 3049) & + 945(n — 1)8 (3n+1 3)t 
bi. 1296[,/ (2am) } m(m — 1)*[ 1 + #/(m— 1) jh 
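\[ = p_0(t) + \lambda_3\,p_{\lambda_3}(t) - \lambda_4\,p_{\lambda_4}(t) + \lambda_3^2\,p_{\lambda_3^2}(t) + \lambda_5\,p_{\lambda_5}(t) + \lambda_3\lambda_4\,p_{\lambda_3\lambda_4}(t) + \lambda_3^3\,p_{\lambda_3^3}(t). \tag{5.3} \]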


Polynomials in t appearing in the numerator of the corrective functions are analogous in
form to the Hermitian polynomials of the same degree. The form of the corrective term due
to $\lambda_5$ is simple but those due to $\lambda_3\lambda_4$ and $\lambda_3^3$ are rather complicated.


6. TAIL-AREA PROBABILITIES


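We are interested only in the tail area of these corrective terms and so consider the integrals

\[ \int_{-\infty}^{-t_0} p(t)\,dt \quad \text{or} \quad \int_{t_0}^{\infty} p(t)\,dt. \]

For the first of the four right-hand members of (4.2) the integral has been tabulated. For the
next member the two tail areas will be equal in magnitude but opposite in sign. In each of
the other two members we shall have equal areas at both tails. These integrals may be
tabulated for given values of n and $t_0$, either directly from their algebraic expressions or more
conveniently by using the Incomplete Beta function table of Pearson. For the sake of
completeness we shall give here the expressions for tail areas of all the components,
introducing $P_0(t_0)$, $P_{\lambda_3}(t_0)$, $P_{\lambda_4}(t_0)$ and $P_{\lambda_3^2}(t_0)$ to represent them. Thus, for one tail, the corrected
probability integral $P(t_0)$ will be given by

\[ P(t_0) = P_0(t_0) + \lambda_3 P_{\lambda_3}(t_0) - \lambda_4 P_{\lambda_4}(t_0) + \lambda_3^2 P_{\lambda_3^2}(t_0), \tag{6.1} \]

in which

\[ P_0(t_0) = \int_{-\infty}^{-t_0} p_0(t)\,dt = \int_{t_0}^{\infty} p_0(t)\,dt = \tfrac{1}{2} I_{w_0}\!\left(\frac{n-1}{2}, \frac{1}{2}\right), \tag{6.2} \]

\[ P_{\lambda_3}(t_0) = \int_{-\infty}^{-t_0} p_{\lambda_3}(t)\,dt = -\int_{t_0}^{\infty} p_{\lambda_3}(t)\,dt = \frac{1}{6\sqrt{(2n\pi)}} \cdot \frac{1 + (2n-1)t_0^2/(n-1)}{[1+t_0^2/(n-1)]^{\frac{1}{2}(n+1)}}, \tag{6.3} \]

which is evidently of order $n^{-1/2}$. For calculation, however, it is more convenient to write

\[ P_{\lambda_3}(t_0) = \left\{\frac{2n-1}{6\sqrt{(2n\pi)}}\right\} w_0^{\frac{1}{2}(n-1)} - \left\{\frac{n-1}{3\sqrt{(2n\pi)}}\right\} w_0^{\frac{1}{2}(n+1)}. \tag{6.3 bis} \]

We also have

\[ P_{\lambda_4}(t_0) = \int_{-\infty}^{-t_0} p_{\lambda_4}(t)\,dt = \int_{t_0}^{\infty} p_{\lambda_4}(t)\,dt = \frac{\Gamma(\tfrac{1}{2}n)}{12(n-1)\sqrt{[\pi(n-1)]}\;\Gamma[\tfrac{1}{2}(n-1)]} \cdot \frac{t_0^3 - \dfrac{3(n-1)}{n+1}\,t_0}{[1+t_0^2/(n-1)]^{\frac{1}{2}(n+2)}} \tag{6.4} \]

\[ = \frac{n-1}{24}\, I_{w_0}\!\left(\frac{n-1}{2}, \frac{1}{2}\right) - \frac{(n-1)(n+2)}{12n}\, I_{w_0}\!\left(\frac{n+1}{2}, \frac{1}{2}\right) + \frac{(n+4)(n-1)}{24n}\, I_{w_0}\!\left(\frac{n+3}{2}, \frac{1}{2}\right), \tag{6.4 bis} \]

and lastly,

\[ P_{\lambda_3^2}(t_0) = \int_{-\infty}^{-t_0} p_{\lambda_3^2}(t)\,dt = \int_{t_0}^{\infty} p_{\lambda_3^2}(t)\,dt = \frac{(2n+5)\,\Gamma(\tfrac{1}{2}n)}{72(n-1)^{\frac{3}{2}}\sqrt{\pi}\;\Gamma[\tfrac{1}{2}(n+1)]} \cdot \frac{t_0^5 + \dfrac{2(n-1)(2n-7)}{(n+1)(2n+5)}\,t_0^3 - \dfrac{3(2n+11)(n-1)^2}{(n+1)(n+3)(2n+5)}\,t_0}{[1+t_0^2/(n-1)]^{\frac{1}{2}(n+4)}} \tag{6.5} \]

\[ = \frac{(n-1)(2n+5)}{72}\, I_{w_0}\!\left(\frac{n-1}{2}, \frac{1}{2}\right) - \frac{(n-1)(2n^2+5n+8)}{24n}\, I_{w_0}\!\left(\frac{n+1}{2}, \frac{1}{2}\right) + \frac{(n-1)(2n^2+5n+12)}{24n}\, I_{w_0}\!\left(\frac{n+3}{2}, \frac{1}{2}\right) - \frac{(n-1)(2n^2+5n+12)}{72n}\, I_{w_0}\!\left(\frac{n+5}{2}, \frac{1}{2}\right), \tag{6.5 bis} \]

where $t_0$ is any typical value of t, and the transformed variate $w_0$ is given by

\[ w_0 = \frac{1}{1 + t_0^2/(n-1)}, \]

and $I_{w_0}(\tfrac{1}{2}\nu_1, \tfrac{1}{2}\nu_2)$ is the Incomplete B-function as defined by Pearson. Obviously $P_{\lambda_4}(t_0)$ and
$P_{\lambda_3^2}(t_0)$ are of order $n^{-1}$. For $t_0 = 0(0.5)4$ and n = 2, 3, 4, 5, 6, 7, 9, 13, 25 and $\infty$, values of
$P_0(t_0)$, $P_{\lambda_3}(t_0)$, $P_{\lambda_4}(t_0)$ and $P_{\lambda_3^2}(t_0)$ have been tabulated (Table 1) with the help of Pearson's
Incomplete B-function table. The sample sizes are chosen to correspond to the degrees of
freedom 1, 2, 3, 4, 5, 6, 8, 12, 24 and $\infty$, as in Fisher's table of z.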
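
The algebraic forms of the corrective tail areas lend themselves to direct computation. The Python sketch below is illustrative only: the function name `tail_corrections` is hypothetical, and the expressions coded are a reading of (6.3), (6.4) and (6.5), with $P_{\lambda_3}$ taken as the area of the lower tail. A few values are printed for comparison with the entries of Table 1 for n' = 1 and n' = 4.

```python
import math


def tail_corrections(t0, n):
    """Return (P_lam3, P_lam4, P_lam3sq) at t0 for a sample of size n."""
    a = n - 1.0                       # degrees of freedom n'
    u = 1.0 + t0 * t0 / a             # equal to 1 / w0
    # (6.3): lower-tail area of the lambda_3 term.
    p_lam3 = ((1.0 + (2 * n - 1) * t0 ** 2 / a)
              / (6.0 * math.sqrt(2 * n * math.pi)) * u ** (-(n + 1) / 2))
    # (6.4): single-tail area of the lambda_4 term.
    p_lam4 = (math.gamma(n / 2)
              / (12.0 * a * math.sqrt(math.pi * a) * math.gamma(a / 2))
              * (t0 ** 3 - 3.0 * a * t0 / (n + 1)) * u ** (-(n + 2) / 2))
    # (6.5): single-tail area of the lambda_3^2 term.
    b3 = 2.0 * a * (2 * n - 7) / ((n + 1) * (2 * n + 5))
    b1 = 3.0 * (2 * n + 11) * a * a / ((n + 1) * (n + 3) * (2 * n + 5))
    p_lam3sq = ((2 * n + 5) * math.gamma(n / 2)
                / (72.0 * a ** 1.5 * math.sqrt(math.pi) * math.gamma((n + 1) / 2))
                * (t0 ** 5 + b3 * t0 ** 3 - b1 * t0) * u ** (-(n + 4) / 2))
    return p_lam3, p_lam4, p_lam3sq


for n, t0 in [(2, 1.0), (2, 2.0), (5, 1.0)]:
    values = tail_corrections(t0, n)
    print(n - 1, t0, [round(v, 4) for v in values])
```

For n' = 1 and $t_0 = 2$, for example, the three values may be set against the tabulated 0.0547, 0.0064 and 0.0188.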

It will be noticed that the effect of $\lambda_4$ is very small, as has been inferred by various authors.
The corrective tail area beyond $|t_0| = 3$ (which is nearly the 4 and 3 % points in normal
samples of 5 and 6 respectively) is at most 1 % when the size of sample does not exceed 5,
about 0.3 % for samples of 10 and about 0.1 % for samples of 20. The actual probability is
obtained by multiplying these tail areas by the given value of $\lambda_4$, and they are additive to
that of the normal theory t.

As Geary points out, the effect of $\lambda_3$ is rather serious, but so is that of $\lambda_3^2$. For a two-sided
test the positive and negative tail areas will, of course, balance each other, but those due to
$\lambda_3^2$ will not.

For a comparative study of the form of the frequency curves of the normal theory t and
those of the corrective functions for $\lambda_3$, $\lambda_4$ and $\lambda_3^2$, diagrams have been constructed for
n = 3, 6, 13 and 25. They are shown in Figs. 1-4. As is to be expected, the curves of the
corrective functions tend to those of the third, fourth and sixth derivatives of the normal
function as n tends to infinity.

An example to illustrate the use of the tables has been taken from Neyman & Pearson's
(1928) paper 'On the use and interpretation of certain test criteria'. The given values of $\lambda_3$
and $\lambda_4$ are not large and more or less satisfy A. L. Bowley's (1928) criterion for moderately
abnormal curves.

Example (Neyman & Pearson, 1928, p. 203). Records of weight in a large population of
mice for males between 120 and 140 days of age show a mean value of 23.823 g., and for the
frequency distribution $\beta_1 = 0.086$, $\beta_2 = 2.687$. Can the group of six mice with the weights
22.5, 26.0, 20.5, 24.0, 18.0 and 24.5 be considered a random sample from this population?

Here $\lambda_3 = +0.293$ (inferred as positive), $\lambda_3^2 = 0.086$, $\lambda_4 = -0.313$, and we find $t = -1.038$
($= -t_0$, say) for n' = 5 degrees of freedom. Considering both tails, the normal theory
probability is $P_0\{|t| > t_0\} = 0.3468$,








Table 1. Comparative values of $P_0(t_0)$, $P_{\lambda_3}(t_0)$, $P_{\lambda_4}(t_0)$ and $P_{\lambda_3^2}(t_0)$ for different degrees of freedom
n' (= n-1, n being the size of sample). $P(t_0) = P_0(t_0) + \lambda_3 P_{\lambda_3}(t_0) - \lambda_4 P_{\lambda_4}(t_0) + \lambda_3^2 P_{\lambda_3^2}(t_0)$

 t0      P0       P(l3)     P(l4)     P(l3^2)      P0       P(l3)     P(l4)     P(l3^2)

                    n' = 1                                     n' = 2
 0.0   0.5000    0.0470    0.0000    0.0000      0.5000    0.0384    0.0000    0.0000
 0.5   0.3524    0.0589   -0.0064   -0.0066      0.3333    0.0495   -0.0069   -0.0066
 1.0   0.2500    0.0665    0.0000    0.0044      0.2113    0.0597   -0.0027    0.0009
 1.5   0.1872    0.0622    0.0047    0.0147      0.1362    0.0563    0.0025    0.0118
 2.0   0.1476    0.0547    0.0064    0.0188      0.0918    0.0469    0.0047    0.0172
 2.5   0.1211    0.0476    0.0066    0.0195      0.0648    0.0375    0.0051    0.0179
 3.0   0.1024    0.0416    0.0064    0.0188      0.0477    0.0298    0.0047    0.0165
 3.5   0.0886    0.0368    0.0059    0.0176      0.0364    0.0239    0.0041    0.0145
 4.0   0.0780    0.0329    0.0055    0.0163      0.0286    0.0194    0.0035    0.0125

                    n' = 3                                     n' = 4
 0.0   0.5000    0.0332    0.0000    0.0000      0.5000    0.0297    0.0000    0.0000
 0.5   0.3257    0.0431   -0.0062   -0.0056      0.3217    0.0387   -0.0055   -0.0047
 1.0   0.1955    0.0540   -0.0034   -0.0002      0.1870    0.0495   -0.0036   -0.0005
 1.5   0.1153    0.0513    0.0013    0.0098      0.1040    0.0473    0.0006    0.0084
 2.0   0.0697    0.0413    0.0035    0.0152      0.0581    0.0372    0.0028    0.0135
 2.5   0.0439    0.0310    0.0039    0.0157      0.0334    0.0266    0.0031    0.0139
 3.0   0.0288    0.0229    0.0034    0.0139      0.0200    0.0184    0.0027    0.0119
 3.5   0.0197    0.0169    0.0028    0.0114      0.0124    0.0127    0.0021    0.0095
 4.0   0.0137    0.0126    0.0023    0.0093      0.0081    0.0088    0.0016    0.0072

                    n' = 5                                     n' = 6
 0.0   0.5000    0.0271    0.0000    0.0000      0.5000    0.0251    0.0000    0.0000
 0.5   0.3192    0.0355   -0.0049   -0.0041      0.3174    0.0329   -0.0044   -0.0035
 1.0   0.1816    0.0397   -0.0035   -0.0005      0.1780    0.0430   -0.0033   -0.0005
 1.5   0.0970    0.0440    0.0002    0.0074      0.0921    0.0413    0.0000    0.0066
 2.0   0.0510    0.0340    0.0022    0.0122      0.0462    0.0315    0.0019    0.0111
 2.5   0.0272    0.0234    0.0025    0.0125      0.0233    0.0210    0.0021    0.0113
 3.0   0.0150    0.0154    0.0021    0.0104      0.0120    0.0132    0.0017    0.0092
 3.5   0.0086    0.0099    0.0016    0.0079      0.0064    0.0081    0.0012    0.0067
 4.0   0.0052    0.0065    0.0011    0.0057      0.0036    0.0050    0.0008    0.0047

                    n' = 8                                     n' = 12
 0.0   0.5000    0.0222    0.0000    0.0000      0.5000    0.0184    0.0000    0.0000
 0.5   0.3153    0.0291   -0.0037   -0.0028      0.3131    0.0243   -0.0027   -0.0019
 1.0   0.1733    0.0384   -0.0030   -0.0005      0.1685    0.0325   -0.0023   -0.0002
 1.5   0.0860    0.0371   -0.0002   -0.0055      0.0797    0.0315   -0.0004    0.0042
 2.0   0.0403    0.0277    0.0014    0.0094      0.0343    0.0230    0.0009    0.0073
 2.5   0.0185    0.0177    0.0016    0.0095      0.0140    0.0137    0.0011    0.0072
 3.0   0.0085    0.0103    0.0013    0.0074      0.0055    0.0072    0.0008    0.0053
 3.5   0.0040    0.0058    0.0008    0.0051      0.0022    0.0036    0.0005    0.0033
 4.0   0.0020    0.0032    0.0005    0.0033      0.0009    0.0017    0.0003    0.0019

                    n' = 24                                    n' = infinity
 0.0   0.5000    0.0133    0.0000    0.0000      0.5000
 0.5   0.3101    0.0176   -0.0015   -0.0010      0.3085
 1.0   0.1636    0.0238   -0.0014   -0.0001      0.1587
 1.5   0.0733    0.0232   -0.0003    0.0025      0.0668
 2.0   0.0285    0.0164    0.0004    0.0043      0.0228
 2.5   0.0098    0.0090    0.0005    0.0041      0.0062
 3.0   0.0031    0.0041    0.0003    0.0028      0.0013
 3.5   0.0009    0.0016    0.0002    0.0015      0.0002
 4.0   0.0003    0.0006    0.0001    0.0007      0.0000

(Column headings: t0 stands for $t_0$; P0, P(l3), P(l4) and P(l3^2) stand for $P_0(t_0)$, $P_{\lambda_3}(t_0)$, $P_{\lambda_4}(t_0)$ and $P_{\lambda_3^2}(t_0)$.)

















whereas the estimate of the actual probability is

\[ P\{|t| > t_0\} = 0.3447. \]

If we consider the negative tail only, then $P(t < -t_0) = 0.1832$. So that whichever tail is used
the conclusion that there is no reason to doubt the origin of the sample is obvious.
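
The arithmetic of this example can be retraced as follows. The Python sketch is a hedged illustration, not the computation actually performed for the paper: the variable names are hypothetical, the normal theory tail is obtained by Simpson's rule rather than from Pearson's tables, and the corrections for $\lambda_4$ and $\lambda_3^2$ use the algebraic tail areas as read from (6.4) and (6.5). Only these two corrections enter the two-sided probability, the $\lambda_3$ contributions of the two tails being equal and opposite.

```python
import math
from statistics import mean, stdev

weights = [22.5, 26.0, 20.5, 24.0, 18.0, 24.5]    # weights of the six mice (g.)
population_mean = 23.823
beta1, beta2 = 0.086, 2.687
lam3_sq, lam4 = beta1, beta2 - 3.0                # lambda_3^2 = beta_1, lambda_4 = beta_2 - 3

n = len(weights)
a = n - 1                                          # degrees of freedom n'
t = (mean(weights) - population_mean) / (stdev(weights) / math.sqrt(n))
t0 = abs(t)


def p0(x):
    """Normal theory frequency function of t for a sample of size n."""
    return (math.gamma(n / 2) / (math.sqrt(math.pi * a) * math.gamma(a / 2))
            * (1.0 + x * x / a) ** (-n / 2))


def simpson(f, lo, hi, steps=2000):
    """Composite Simpson rule with an even number of steps."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


tail_normal = 0.5 - simpson(p0, 0.0, t0)           # P_0(t0), one tail
u = 1.0 + t0 * t0 / a
tail_lam4 = (math.gamma(n / 2)
             / (12.0 * a * math.sqrt(math.pi * a) * math.gamma(a / 2))
             * (t0 ** 3 - 3.0 * a * t0 / (n + 1)) * u ** (-(n + 2) / 2))
b3 = 2.0 * a * (2 * n - 7) / ((n + 1) * (2 * n + 5))
b1 = 3.0 * (2 * n + 11) * a * a / ((n + 1) * (n + 3) * (2 * n + 5))
tail_lam3sq = ((2 * n + 5) * math.gamma(n / 2)
               / (72.0 * a ** 1.5 * math.sqrt(math.pi) * math.gamma((n + 1) / 2))
               * (t0 ** 5 + b3 * t0 ** 3 - b1 * t0) * u ** (-(n + 4) / 2))

normal_two_sided = 2.0 * tail_normal
corrected_two_sided = 2.0 * (tail_normal - lam4 * tail_lam4 + lam3_sq * tail_lam3sq)
print(round(t, 3), round(normal_two_sided, 4), round(corrected_two_sided, 4))
```

Small differences in the last figure from the values quoted in the text are to be expected, since the text works with t rounded to three decimals and with tabulated tail areas.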


Fig. 1. Showing the terms in the distribution
$p(t) = p_0(t) + \lambda_3 p_{\lambda_3}(t) - \lambda_4 p_{\lambda_4}(t) + \lambda_3^2 p_{\lambda_3^2}(t)$ for n' = n-1 = 2 degrees of freedom.
Explanation of symbols for all figures: the four curves shown are $p_0(t)$, $p_{\lambda_3}(t)$, $p_{\lambda_4}(t)$ and $p_{\lambda_3^2}(t)$.

Fig. 2. Showing the terms in the distribution
$p(t) = p_0(t) + \lambda_3 p_{\lambda_3}(t) - \lambda_4 p_{\lambda_4}(t) + \lambda_3^2 p_{\lambda_3^2}(t)$ for n' = n-1 = 5 degrees of freedom.

Fig. 3. Showing the terms in the distribution
$p(t) = p_0(t) + \lambda_3 p_{\lambda_3}(t) - \lambda_4 p_{\lambda_4}(t) + \lambda_3^2 p_{\lambda_3^2}(t)$ for n' = n-1 = 12 degrees of freedom.

Fig. 4. Showing the terms in the distribution
$p(t) = p_0(t) + \lambda_3 p_{\lambda_3}(t) - \lambda_4 p_{\lambda_4}(t) + \lambda_3^2 p_{\lambda_3^2}(t)$ for n' = n-1 = 24 degrees of freedom.

7. ASYMPTOTIC CHARACTER OF THE SERIES FOR p(t)


Pointing out some of the asymptotic properties up to order $n^{-1/2}$ possessed by the expression
$p_0(t) + \lambda_3 p_{\lambda_3}(t)$ (being his earlier approximation for 'Student's' t), Geary (1936) observed
that for samples of moderate size, the probability

\[ \{p_0(t) + \lambda_3 p_{\lambda_3}(t)\}\,dt \]

might have quite an extended range of applicability, provided that at least the lower
frequency constants $\lambda_4$, $\lambda_5$ and $\lambda_6$ are small. He considered this to be a matter for experimental
investigation.








The probability function

\[ p(t) = p_0(t) + \lambda_3 p_{\lambda_3}(t) - \lambda_4 p_{\lambda_4}(t) + \lambda_3^2 p_{\lambda_3^2}(t) \]

obtained in (4.2) possesses similar asymptotic properties up to order $n^{-1}$. For it represents
the frequency density of 'Student's' t for samples drawn from any parent population if the
samples are so large that the higher order terms in $n^{-\frac{1}{2}r}$ for $r \geq 3$ can be neglected. On the
other hand, it gives the frequency density of 'Student's' t for samples of any size drawn from
the particular parent population (2.1), when the $\lambda$ terms other than in $\lambda_3$, $\lambda_4$ and $\lambda_3^2$ are
negligible. From Geary's (1947) asymptotic formulae for the cumulants of t for any universe
(his results 2.18), it is apparent that these higher $\lambda$ terms can only occur in terms in $n^{-\frac{1}{2}r}$
for $r \geq 3$, so that it is not unlikely that for samples of moderate size p(t) has a quite extended
range of applicability, provided higher order cumulants $\lambda_5$, $\lambda_6$, etc., are small.

The expression for q(t) in (5.3), being correct to $n^{-3/2}$, will provide a closer approximation
to the actual distribution of t than the other two expressions, for samples from any universe
if the terms in $n^{-2}$ and higher negative powers of n are negligibly small, as also for samples
of any size for the parent population specified by (5.1) if $\lambda_6, \ldots, \lambda_3^4$ are small.


Table 2. Showing the experimental determinations of the true probability of t with their standard
errors for various sampled universes together with the corresponding probability obtained
from the frequency function p(t) of (4.2)

                        Size of    Sampled population          Normal theory   Experimental determination   Probability as
Author                  sample n   λ3² = β1    λ4 = β2 − 3     probability     of the probability ± s.e.    estimated from p(t) of (4.2)

E. S. Pearson (1929)    5          0.00        −0.50           0.040           0.044 ± 0.009                0.043
                        5          0.00         1.12           0.040           0.038 ± 0.009                0.034
                        5          0.00         4.07           0.040           0.029 ± 0.008                0.018
                        5          0.20         0.30           0.040           0.044 ± 0.009                0.043
                        5          0.50         0.73           0.040           0.042 ± 0.009                0.048
H. L. Rietz (1939)      5          0.00        −0.63           0.067           0.058 ± 0.007                0.071
                        5          0.03        −0.55           0.067           0.070 ± 0.008                0.071
A. N. K. Nair (1941)    5*         2.00         3.00           0.040           0.062 ± 0.011                0.072
                        6†         4.00         6.00           0.030           0.088 ± 0.010                0.088

* Type III curve. † Exponential curve.


Approximate probabilities correct to $n^{-3/2}$ for a single tail will be obtained by applying to
the normal theory value necessary corrections due to $\lambda_3$, $\lambda_4$, $\lambda_3^2$ and $\lambda_5$, $\lambda_3\lambda_4$, $\lambda_3^3$. But when
both tails are used, corrections for $\lambda_4$ and $\lambda_3^2$ only will furnish the same order of approximation
since the contributions due to odd-order cumulants cancel each other, i.e. the use of p(t),
which provides an approximation correct to $n^{-1}$ only, will actually lead for a two-sided
test to results correct to $n^{-3/2}$. This favours the application of our formula for p(t) to samples
of moderate size from populations whose degree of departure from normality is not necessarily
small.













Results of investigations made by various writers support the conjecture that p(t) or
q(t) may have a quite extended range of applicability. Table 2 shows the values of $\lambda_3^2$ and $\lambda_4$
of the sampled populations, the size of the samples considered, the normal theory probability
$P_0(t_0)$, and the experimental determinations, with their standard errors, of the approximate
probability $P(t_0)$ against their values obtained from p(t). The agreement is quite good, even
in cases where samples of five or six only have been taken from markedly non-normal populations.
One of the two sampled populations of Nair is a very skew Type III Pearson curve, and
the other is the exponential curve. These populations, as is well known, can hardly be
represented adequately by the third or fourth approximation to the law of error.

The agreement between experimental and theoretical results, especially for the two
populations considered by Nair, may be partly due to the fact that the approximation yielded
by p(t) is, as we have noted, not only correct to $n^{-1}$ but also to $n^{-3/2}$, since both tails have been
used for these cases.

For certain values of $\lambda_3^2$ and $\lambda_4$, estimates of $P(t_0)$ at $|t_0| = 3$, for samples of 5 and 6, obtained
from formula (4.2), are shown in Tables 3 and 4. Numerical values of $\lambda_3^2$ and $\lambda_4$ have been
selected to include a fairly wide class of non-normal populations met with in practice. In
spite of the fact that there is a fair agreement between the tabulated results and the experimental
determinations, it is obviously unwise to assume that they necessarily furnish in all
cases a satisfactory estimate of the true probability of t for such small samples.

The forms of the two approximations, namely (4.2), correct to $n^{-1}$, and (5.3), to $n^{-3/2}$, are
different from those which may be correspondingly obtained from Geary's (1947) asymptotic
expansion to $n^{-2}$ (his result (2.24)). He has obtained that expression by using Charlier's
'Differential Series' with the normal theory t as the generating function, utilizing for the
purpose his derived results for the first six cumulants of non-normal t. One of the advantages
of Geary's expression for t is that it takes into account some of the higher cumulants and
higher powers of $\lambda_3$ and $\lambda_4$. Accordingly it may have an extended range of applicability.

But the fact that it agrees, to order $n^{-1}$, with M. S. Bartlett's (1935) result suggests that the
formula is not satisfactory, since the asymptotic character of the latter cannot be assumed.
For Bartlett specified his parent population by the first three terms of the Edgeworth series
which take no account of the term in $\lambda_3^2$, namely, $10\lambda_3^2\,\phi^{(6)}(x)/6!$. The estimates of true
probabilities shown in Geary's Table 2 cannot as such be regarded as satisfactory.


8. PROBABILITY CORRECTIONS DUE TO HIGHER CUMULANTS OF THE 
PARENT POPULATION 


For the effects of $\lambda_5$, $\lambda_3\lambda_4$, $\lambda_3^3$ we consider as before the probability integrals of the corrective
functions and denote them by $P_{\lambda_5}(t_0)$, $P_{\lambda_3\lambda_4}(t_0)$ and $P_{\lambda_3^3}(t_0)$ respectively. We then have


—t rs) 
Py (to) = — [ Py,(t) dt = ( Pra, (t) dt 


J- Jt 
(,_, 2(4n—3), , (2n?—2n+1) ) 
teat 8+ ani 
ea #2 - 
i —— 
40.n Jeem(1+ ; 


“(Sea A?) (SREP EE 


(2n?—2n+1)\, (n—-1 a 
+ aegimeia? }n.("5+ 1): — 





(8-1) 








Table 3. Showing comparative values of

$$P = 1 - \int_{-t_0}^{t_0} p(t)\,dt = 2[P_0(t_0) - \lambda_4 P_{\lambda_4}(t_0) + \lambda_3^2 P_{\lambda_3^2}(t_0)]$$

near the 4 % point (approximately) of normal theory t for samples of 5 (n' = degrees of
freedom = 4)

            λ3²
λ4          0.00      0.20      0.25      0.50      1.0       1.5       2.0       4.0

−2.0        0.0505
−1.5        0.0479    0.0527    0.0539    0.0598
−1.0        0.0452    0.0500    0.0512    0.0572    0.0691
−0.5        0.0426    0.0474    0.0486    0.0545    0.0665    0.0784
 0.0        0.0399    0.0447    0.0459    0.0519    0.0638    0.0758    0.0877
 0.5        0.0373    0.0421    0.0433    0.0492    0.0612    0.0731    0.0851
 1.0        0.0346    0.0394    0.0406    0.0466    0.0585    0.0705    0.0824
 1.5        0.0320    0.0368    0.0380    0.0439    0.0559    0.0678    0.0798
 2.0        0.0293    0.0341    0.0353    0.0413    0.0532    0.0652    0.0771    0.1249
 3.0        0.0240    0.0288    0.0300    0.0360    0.0479    0.0599    0.0718    0.1196
 4.0        0.0187    0.0235    0.0247    0.0307    0.0426    0.0546    0.0665    0.1143
 6.0        0.0081    0.0129    0.0141    0.0201    0.0320    0.0440    0.0559    0.1037
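
Since P in Table 3 is linear in $\lambda_4$ and $\lambda_3^2$, the tabulated entries can be summarised by three constants. The following Python sketch illustrates this; the constants 0.0399, 0.0053 and 0.0239 are not quoted in the text but are read off the tabulated entries (the cell at $\lambda_4 = 0$, $\lambda_3^2 = 0$ and the change per unit of each parameter), so they carry the rounding of the table, and the function name is purely illustrative.

```python
# Sketch: linear form behind Table 3 (samples of 5, |t0| = 3).
# The three constants are inferred from the tabulated entries, not quoted by the author.
TWO_P0 = 0.0399       # entry at lambda_4 = 0, lambda_3^2 = 0, i.e. 2 * P_0(t0)
TWO_P_L4 = 0.0053     # decrease in P per unit of lambda_4
TWO_P_L3SQ = 0.0239   # increase in P per unit of lambda_3^2


def two_tail_probability(lambda4: float, lambda3_sq: float) -> float:
    """Approximate two-tailed probability P = 2[P0 - l4*P_l4 + l3^2*P_l3sq] for n = 5."""
    return TWO_P0 - lambda4 * TWO_P_L4 + lambda3_sq * TWO_P_L3SQ


if __name__ == "__main__":
    # A few (lambda_4, lambda_3^2) pairs, including Nair's Type III case (3.0, 2.0).
    for l4, l3sq in [(0.0, 0.0), (2.0, 4.0), (6.0, 0.0), (3.0, 2.0)]:
        print(l4, l3sq, round(two_tail_probability(l4, l3sq), 4))
```

Intermediate values of $\lambda_4$ and $\lambda_3^2$, such as those of the sampled populations in Table 2, can be handled the same way, subject to the rounding of the inferred constants.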



































Table 4. Showing comparative values of

$$P = 1 - \int_{-t_0}^{t_0} p(t)\,dt = 2[P_0(t_0) - \lambda_4 P_{\lambda_4}(t_0) + \lambda_3^2 P_{\lambda_3^2}(t_0)]$$

near the 3 % point (approximately) of the normal theory t for samples of 6 (n' = degrees
of freedom = 5)

            λ3²
λ4          0.00      0.20      0.25      0.5       1.0       1.5       2.0       4.0

−2.0        0.0386
−1.5        0.0365    0.0406    0.0417    0.0468
−1.0        0.0343    0.0385    0.0393    0.0447    0.0551
−0.5        0.0322    0.0364    0.0374    0.0426    0.0530    0.0634
 0.0        0.0301    0.0343    0.0353    0.0405    0.0509    0.0613    0.0717
 0.5        0.0280    0.0321    0.0332    0.0384    0.0488    0.0592    0.0696
 1.0        0.0259    0.0300    0.0311    0.0363    0.0467    0.0571    0.0674
 1.5        0.0237    0.0279    0.0289    0.0341    0.0445    0.0549    0.0653
 2.0        0.0216    0.0258    0.0268    0.0320    0.0424    0.0528    0.0632    0.1048
 3.0        0.0174    0.0216    0.0226    0.0278    0.0382    0.0486    0.0590    0.1006
 4.0        0.0132    0.0173    0.0184    0.0236    0.0340    0.0443    0.0547    0.0963
 6.0        0.0047    0.0088    0.0099    0.0151    0.0255    0.0359    0.0463    0.0879




































—ty a0 
Py 5, (to) == } Parga, {t) dt = | Praga, (t) dt 


J— 


in—1 in+1 in+3 n+5 
(%34) LJ , 4) — (143) L|-— , 3) + (M59) LA = 2) — (%y) Ls: 1) ’ 


ll 





(8-2) 
where (34), (%43), (%52) and (%,,) are given by 
n—1 
{ (n —1) B("> : 1)| ; 
Na) = | = (40n3 + 258n? + 182n + 51), 
("s) = | S55 Smad) | , 
n+l 
(B("* = 3)| 
panes st fad 3 208Rw2 a 1907 
(43) = \Sevemnt | (150 + 896n + 869n + 297 >, 
; sil $ (8-3) 
(sn("*3, »)| 
(252) = [96 Jam nt (68n* + 311+ 237), 





, < 
(%1 \ 96 ,/(277) n ] 
re —t, (co 
P. s(t) = | P a(t) dt = Paalt) li 
x J & 





= (M45) Lal 5) + (50) ful “5 t, 4) (00) gl” ==, 3) 


+ (N79) Lee > 2) wie (%g) LS > i) ? (8-4) 


- 


where 


[im—1) B(">*, 5)| 


(n,;5) = | | (64n4 + 952n3 + 3578n2 + 301 1n + 996), 


[a(*t* 4)) 
(n54) = \; —r 
| 





4 fy Ss 1 TH) 24 TEE 2 q 
| 228 (am ni (144n + 1918n +70LIin + 7550n + 2913), 


| 
B n uh 3 
se Se Ns on 





(% 3) = 96 (am) nt v3 + 2821 n? + 8554n + 7077), 
jsa("=", 2 
(n-) = | Sse 7em) : (541n* + 2086n + 3049), 
35B("+7 1 
(ns) = --— (3m + 13), 


with the usual notations of Complete and Incomplete B-functions. 










The additional formulae (8.1), (8.2) and (8.4) of this section enable us to deduce the probability
of t when the parent population can be represented by the fourth approximation.
The corrected probability $P(t_0)$, for a single tail, may be obtained from the formula

$$P(t_0) = P_0(t_0) + \lambda_3 P_{\lambda_3}(t_0) - \lambda_4 P_{\lambda_4}(t_0) + \lambda_3^2 P_{\lambda_3^2}(t_0) + \lambda_5 P_{\lambda_5}(t_0) + \lambda_3\lambda_4 P_{\lambda_3\lambda_4}(t_0) + \lambda_3^3 P_{\lambda_3^3}(t_0). \tag{8.6}$$

It should be noted that for the negative tail of the distribution $P_{\lambda_3}(t_0)$ and $P_{\lambda_3^3}(t_0)$ are positive,
whereas $P_{\lambda_5}(t_0)$ and $P_{\lambda_3\lambda_4}(t_0)$ are negative, the reverse being the case with the other tail.

For samples of 10 from such a population the probability at $t_0 = -2.262$ (here $|t_0|$ is the
5 % point of the normal theory t) will be given by

$$P(-2.262) = 0.0250 + \lambda_3(0.0209) - \lambda_4(0.0015) + \lambda_3^2(0.0092) + \lambda_5(-0.0017) + \lambda_3\lambda_4(-0.0111) + \lambda_3^3(0.0221).$$

If $\lambda_3 = 1$, $\lambda_4 = 1$, then for a Pearson curve, $\lambda_5 = 0.25$, and in that case the probability is
found to be 0.0642, whereas the estimate from terms as far as $\lambda_3^2$ only is 0.0536. Thus for
a single tail they differ considerably from the normal theory probability 0.0250.
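
The arithmetic of this example can be checked directly. The Python sketch below evaluates the expression for P(−2.262) with the coefficients quoted above; the function name and the `full` switch are illustrative assumptions, and the reading $\lambda_3 = 1$, $\lambda_4 = 1$, $\lambda_5 = 0.25$ is the one that reproduces both quoted values.

```python
# Sketch: one-tail probability at t0 = -2.262 for samples of 10,
# using the numerical coefficients quoted in the text.
def one_tail_probability(l3: float, l4: float, l5: float, full: bool = True) -> float:
    """Return P(-2.262); with full=False stop at the term in lambda_3^2."""
    # Terms up to lambda_3^2 (the approximation p(t)).
    p = 0.0250 + l3 * 0.0209 - l4 * 0.0015 + l3**2 * 0.0092
    if full:
        # Additional corrections for lambda_5, lambda_3*lambda_4 and lambda_3^3.
        p += l5 * (-0.0017) + l3 * l4 * (-0.0111) + l3**3 * 0.0221
    return p


if __name__ == "__main__":
    print(round(one_tail_probability(1.0, 1.0, 0.25, full=False), 4))
    print(round(one_tail_probability(1.0, 1.0, 0.25, full=True), 4))
```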


SUMMARY 


The theoretical distribution of 'Student's' t in non-normal samples of any size has been
derived with reference to the parent population specified by a number of terms of the
Edgeworth series. It contains, in addition to the normal theory frequency function of t,
corrective terms due to the cumulants $\lambda_3$, $\lambda_4$, $\lambda_3^2$ and $\lambda_5$, $\lambda_3\lambda_4$, $\lambda_3^3$. It is assumed that the
values of the population $\lambda$'s are known.

If the population can be well represented by the third or the fourth approximation (in
which cases next higher $\lambda$'s and higher powers of $\lambda_3$ and $\lambda_4$ are small), then the corresponding
t-distributions will be accurate for any size of sample, as far as similar order terms in $\lambda$'s.

The two expressions for t, namely, the one including the terms up to that in $\lambda_3^2$ and the other
up to that in $\lambda_3^3$, are asymptotic to order $n^{-1}$ and $n^{-3/2}$ respectively, and as such they afford
closer approximations to the actual distribution of t for large samples from any universe.
The probabilities calculated from the former will be the same as those from the latter, if both
the tails of t are used. So it is not unlikely that for moderate size of sample, the former may
have an extended range of applicability. The satisfactory agreement between the theoretical
values and the experimental results of various writers appears to support this conjecture.

Tail-area probabilities of the corrective terms for $\lambda_3$, $\lambda_4$ and $\lambda_3^2$ have been tabulated.
An example is given to illustrate the use of the tables. Diagrams showing the frequency
curve of non-normal t for certain values of n have been constructed.

The probability corrections due to higher cumulants of the parent population have also
been considered for a representative value of n.


I wish to acknowledge my indebtedness to Dr H. E. Daniels for his kind advice and
criticism in the course of my investigations, also to Dr J. Wishart for suggesting a number
of improvements to the paper. My thanks are due to Mr D. A. East for drawing the
diagrams.







REFERENCES 


BARTLETT, M. S. (1935). Proc. Camb. Phil. Soc. 31, 223.

BOWLEY, A. L. (1928). F. Y. Edgeworth's Contributions to Mathematical Statistics.
London: Royal Statistical Society.

CRAMÉR, H. (1928). Skand. AktuarTidskr. 11, 13, 141.

CRAMÉR, H. (1946). Mathematical Methods of Statistics. Princeton University Press.

EDGEWORTH, F. Y. (1905). Trans. Camb. Phil. Soc. 20, 36, 113.

EDGEWORTH, F. Y. (1906). J.R. Statist. Soc. 69, 497.

GEARY, R. C. (1936). J.R. Statist. Soc. Suppl. 3, 178.

GEARY, R. C. (1947). Biometrika, 34, 209.

NAIR, A. N. K. (1941). Sankhyā, 5, 393.

NEYMAN, J. & PEARSON, E. S. (1928). Biometrika, 20A, 175, 263.

PEARSON, E. S. (1928, 1929). Biometrika, 20A, 356 and 21, 259.

RIETZ, H. L. (1939). Ann. Math. Statist. 10, 265.










THE COMBINATION OF PROBABILITIES ARISING FROM DATA 
IN DISCRETE DISTRIBUTIONS 


By H. O. LANCASTER, Rockefeller Fellow in Medicine 


INTRODUCTORY 


1. Two common methods of combining probabilities from different experiments make use
of the additive properties of $\chi^2$. Thus the results of the various experiments may each be
expressed as a standardized normal deviate or as its square. In these cases the probabilities
are readily combined by the summation of $\chi^2$. In other cases there may be obtained the
probability, on the null hypothesis, that a result as divergent as the observed or one more
divergent would occur. Such probabilities may be combined in a simple way by the use of
the transformation $\chi^2 = -2\log_e P$. No difficulties are met in data from continuous populations
nor in discrete populations, where many different observations are possible, such as occur
with large samples. However, with discrete populations, as, for example, the binomial with
low index or the fourfold table with small numbers, biases arise which may diminish the
power of the tests. This is most easily seen in the case of the binomial if independent values
of $\chi_c^2$, that is, $\chi^2$ corrected for continuity, are summed. As these biases have not previously
received much attention, we examine the problem in some detail. We suggest that, for the
$\chi^2$ with one degree of freedom, only the crude $\chi^2$ is suitable for summation. No obvious
modification for the probability integral transformation is available, but we suggest the use
of the 'mean value $\chi^2$' or an approximation to it, the 'median value $\chi^2$'. The former, when
the null hypothesis is true, has an expectation rigorously equal to the theoretical value
of 2 and a variance slightly less than the theoretical value of 4.

We discuss, first, the case of the binomial with low index, where an enumeration of all
possible events and their relative frequencies of occurrence under the null or other hypothesis
is practicable. Then we pass on to the more difficult case of the fourfold tables where the
number of possible events is greatly increased, even with the simplified conditions which
we have selected.

THE BINOMIAL 


2. Notation. In the discussion of the binomial, we shall take p and q to be the probability
of success or failure of an individual observation as specified by the null hypothesis, n the
number of trials, m the observed number of successes, and p′ and q′ the corresponding
probabilities in the actual population which is being sampled. We shall use P in the usual
sense of the calculated probability of the observation and all observations more extreme, and
P′ to be the probability of these more extreme observations alone, so that (P − P′) is the
probability (i.e. the relative frequency of occurrence) of the observed result. Occasionally,
as in (1), it will be necessary to use P as a continuous variable. In connexion with the
probability integral transformation, we define the mean value $\chi^2$, ($\chi_m^2$), and the median value
$\chi^2$, ($\chi_{m'}^2$), as follows:

$$\chi_m^2 = \int_{P'}^{P} (-2\log_e P)\,dP \,/\, (P - P')$$
$$\phantom{\chi_m^2} = 2 - 2\{P\log_e P - P'\log_e P'\}/(P - P'), \quad \text{if } P' \neq 0, \tag{1}$$
$$\phantom{\chi_m^2} = 2 - 2\log_e P, \quad \text{if } P' = 0;$$

$$\chi_{m'}^2 = -2\log_e \tfrac{1}{2}\{P + P'\}. \tag{2}$$




3. A note on the correction for continuity in the case of the binomial distribution. Yates (1934)
made a definite, although indirect, statement on the use of his continuity correction for the
combination of probabilities from different experiments in the following words:

'$P(\chi')$ [i.e. our $P(\chi_c)$] gives 0.1427, an excellent approximation, whereas $P(\chi)$ gives 0.0612,
which, though not in itself attaining significance, is less than half its true value; this would
be exceedingly misleading if a number of such probabilities from different classes of experiment
were to be combined.' D. Mainland (1948) takes this to imply that $\chi_c^2$, and not the crude
$\chi^2$, is to be summed, a fallacious practice already rejected by Cochran (1942). We may
note that

$$E(\chi_c^2) = E(|m - np| - \tfrac{1}{2})^2/(npq)$$
$$\phantom{E(\chi_c^2)} = E(m - np)^2/(npq) - E|m - np|/(npq) + 1/(4npq)$$
$$\phantom{E(\chi_c^2)} = 1 + 1/(4npq) - (\text{mean deviation})/(\text{variance}). \tag{3}$$

If np is integral, the expectation is less than that given above by a quantity equal to the
frequency of the modal term divided by four times the variance. We have always here
diminished $|m - np|$ by ½. This will result in a change of sign if $|m - np|$ is less than ½, but,
in general, it is obvious that only a trivial and non-systematic difference will be made to
the expectation by the alternative method of not diminishing $|m - np|$ by ½ if it be less
than ½, and the method used here has rendered the algebra much easier. If neither n, np nor
nq be small, this bias due to the use of the continuity correction is trivial, but then the
continuity correction would be unnecessary; in those cases for which the correction is said
to be essential, the mean deviation may be of the same order of magnitude as the variance
or, at any rate, not a small fraction of it, and so bias will result. We note, for example, that
in the binomial $(\tfrac{1}{2} + \tfrac{1}{2})^2$ the expectation of $\chi_c^2$ is 0.25 if the null hypothesis be true. Many will
regard this as an unnecessarily extreme case, but it is not as extreme as Mainland's example
(on his p. 54) which we quote in § 9 below.
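
Equation (3) and the value 0.25 can be checked by direct enumeration. The Python sketch below is illustrative only: the function names are assumptions, and the `always_diminish` switch models the two conventions discussed above (always subtracting ½ from $|m - np|$, or leaving it unchanged when it is less than ½). On this reading, the quoted value 0.25 for the binomial with n = 2, p = ½ corresponds to the second convention, while the first reproduces formula (3) exactly.

```python
from math import comb


def binom_pmf(n: int, p: float) -> list[float]:
    """Relative frequencies of m = 0..n successes."""
    return [comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(n + 1)]


def expected_corrected_chi2(n: int, p: float, always_diminish: bool = True) -> float:
    """Expectation of the continuity-corrected chi-square by enumeration."""
    variance = n * p * (1 - p)
    total = 0.0
    for m, freq in enumerate(binom_pmf(n, p)):
        dev = abs(m - n * p)
        # Subtract 1/2 always, or only when the deviation is at least 1/2.
        if always_diminish or dev >= 0.5:
            dev -= 0.5
        total += freq * dev * dev / variance
    return total


def formula_3(n: int, p: float) -> float:
    """Right-hand side of (3): 1 + 1/(4npq) - (mean deviation)/(variance)."""
    variance = n * p * (1 - p)
    mean_dev = sum(f * abs(m - n * p) for m, f in enumerate(binom_pmf(n, p)))
    return 1 + 1 / (4 * variance) - mean_dev / variance


if __name__ == "__main__":
    print(expected_corrected_chi2(2, 0.5), formula_3(2, 0.5))
    print(expected_corrected_chi2(2, 0.5, always_diminish=False))
```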


4. The power of the methods of combination of probabilities. In Table 1A we have attempted
to give some idea of the power of the tests, in the sense of Neyman & Pearson (1933), for
a binomial population with index of 5. We note first of all that no single experiment can give
a result significant at the 5 % level, so that we are led to some method of combination of
successive experiments, and we have compared the results of (a) simple pooling of the
successes and failures, (b) summation of the crude $\chi^2$, (c) summation of the mean value $\chi^2$,
(d) summation of the $-2\log_e P$, (e) summation of $\chi_c^2$, which we find are decreasingly powerful
in that order.* We have taken p = 0.5 as the null hypothesis and have enumerated the
possible events in populations with various values of p′ and have calculated the relative
frequency of each event. We have then been able to calculate the frequency with which
we may expect results significant at the 5 % level combining two, three or four experiments;
in the rows indicated by the symbol ∞, we have shown the limits to which results tend if
very many samples are combined. Thus, we have been able to show that, if the null hypothesis
were true in the population sampled, by combining two experiments it would be rejected in
4.3 % of cases using the summation of the crude $\chi^2$, whereas if the p′ of the population sampled
were 0.25 or 0.75, then it would be rejected in some 25.2 % of cases and so on. If a $\chi^2$ has an
expectation below its appropriate theoretical value of either 1 or 2, respectively, then we
may assume that in an indefinitely large number of experiments the value of $\chi^2$ obtained is
below that of the 5 % significance level in practically every set of experiments combined,
and so we have assumed that no significant results will be obtained in these cases. However,
if the expectation is above the theoretical value, then we have assumed that a summation
of the $\chi^2$ from an indefinitely large number of experiments combined will always yield a
significant result. Thus we find that, if p′ = 0.33, the expected value of $(-2\log_e P)$ is 1.480,
and so, if the experiment were to be repeated a thousand times we should have a $\chi^2$ of
approximately 1480 with 2000 degrees of freedom and should have no ground for rejecting
the null hypothesis. In this example we have scored the rejection rate as zero. If, on the
other hand, the expectation of $\chi^2$ were above 2, as in the case where p′ = 0.2, we should expect
every indefinitely long series of experiments to lead to the rejection of the null hypothesis.
As was to be anticipated, simple pooling was found to be the most powerful method of
combination. The crude $\chi^2$ and $\chi_m^2$ were equally powerful in this particular group of populations,
whereas $\chi_c^2$ and the probability integral transformation were much less powerful and,
in fact, were unable to detect considerable departures from the null hypothesis even with
a large number of experiments available. Somewhat similar findings occur with samples
from the populations $(p' + q')^{10}$ when the null hypothesis specifies that p = 0.5, as can be
seen from Table 1B. In these discrete populations we can only give arithmetical results in
specific cases, since the use of approximate methods would remove the effect of the discreteness
which is what we desire to study. We have spent some time on the binomial because
the problems here are more easily studied than in the case of the fourfold table where the
enumeration of cases becomes impracticable, and because the binomial is the limiting case
of the series involved in the fourfold table, when the members of one row become indefinitely
large.

* In all cases we have used a two-sided test, e.g. where the null hypothesis gives the binomial $(\tfrac{1}{2} + \tfrac{1}{2})^5$,
we have pooled together results with 0 or 5, 1 or 4 and 2 or 3 successes.
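
The enumeration described in this section can be sketched in a few lines of Python. The code below is an illustration under stated assumptions, not the author's computation: the function names are hypothetical, 5.991 is taken as the 5 % point of $\chi^2$ with 2 degrees of freedom, and p′ = 1/3 is assumed for the case quoted as 0.33. It computes the rejection rate when the crude $\chi^2$ of two samples of 5 are summed, and the expectation of the two-sided $-2\log_e P$.

```python
from itertools import product
from math import comb, log

N, P0 = 5, 0.5  # index of the binomial and the null-hypothesis value of p


def pmf(p: float) -> list[float]:
    """Relative frequencies of m = 0..N successes when the true probability is p."""
    return [comb(N, m) * p**m * (1 - p) ** (N - m) for m in range(N + 1)]


def crude_chi2(m: int) -> float:
    """Crude chi-square (1 d.f.) of one sample against the null hypothesis."""
    return (m - N * P0) ** 2 / (N * P0 * (1 - P0))


def rejection_rate(p_true: float, k: int = 2, critical: float = 5.991) -> float:
    """Frequency with which the summed crude chi-square of k samples exceeds `critical`."""
    f = pmf(p_true)
    rate = 0.0
    for ms in product(range(N + 1), repeat=k):
        if sum(crude_chi2(m) for m in ms) > critical:
            prob = 1.0
            for m in ms:
                prob *= f[m]
            rate += prob
    return rate


def expected_minus_2_log_p(p_true: float) -> float:
    """Expectation of -2 ln P, with P the two-sided null probability of the result."""
    null, f = pmf(P0), pmf(p_true)
    total = 0.0
    for m in range(N + 1):
        # Null probability of results at least as far from the mean as m.
        p_val = sum(null[j] for j in range(N + 1)
                    if abs(j - N * P0) >= abs(m - N * P0))
        total += f[m] * (-2 * log(p_val))
    return total


if __name__ == "__main__":
    print(round(100 * rejection_rate(0.5), 1), round(100 * rejection_rate(0.25), 1))
    print(round(expected_minus_2_log_p(1 / 3), 3))
```

Passing `k=3` or `k=4` with the corresponding critical value would extend the same enumeration to three or four experiments.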


5. The probability integral transformation, $\chi^2 = -2\log_e P$. We discuss now in greater
detail the transformation, $\chi^2 = -2\log_e P$, as introduced by R. A. Fisher (1932). A similar
transformation was dealt with by Karl Pearson (1933) and later E. S. Pearson (1938). None
of these authors appears to have considered the effects of discontinuity, but the recent
increasing use of very small samples renders such an inquiry now necessary. This transformation
gives unbiased results when based on continuous distributions, but it will be shown
that it must be used with caution in the case of discrete distributions. Under the null hypothesis,
all values of P are considered to be equally likely, so that, if F is the distribution
function of P,

$$dF = dP. \tag{4}$$

Integrating, this gives

$$F = P \quad (0 \le P \le 1), \tag{5}$$

since the probability of obtaining P = 0 is 0. Then f(P), the frequency function, is given by

$$f(P) = \text{constant} = 1 \quad (0 \le P \le 1). \tag{6}$$

After the transformation, $y = -2\log_e P$, we find that y has a frequency function

$$f(y) = \tfrac{1}{2}\exp(-\tfrac{1}{2}y), \tag{7}$$


so that y is distributed as $\chi^2$ with 2 degrees of freedom. Thus, any number of probabilities,
P, may be converted by means of this transformation to $\chi^2$ for 2 degrees of freedom, and,
using the additive properties of the $\chi^2$ distribution, the $\chi^2$ may be summed together with the
degrees of freedom in the usual manner. It is necessary, however, to note that P must be
assumed to come from a rectangular population with end-points 0 and 1. Further, in continuous
populations the transforms of P and (1 − P) both have a finite expectation. This is
not so with discrete populations as can easily be seen in the case of the binomial, since there
is a finite probability of obtaining a result 1 − P = 0, when $-2\log_e(1 - P) = \infty$. If, moreover,
a two-sided comparison is made so that we always sum the probabilities from the tail in which
the observation lies, a further difficulty arises. In order to define which tail to use, we must
fix some dividing point and sum from the lower tail if the observation falls below this point
and otherwise sum from the upper tail. In the case of a continuous distribution the most
natural point to take is the median of the sampling distribution and then we may take for P
in the transformation, double the probability integral calculated from the appropriate tail.
This was the procedure suggested by Sukhatme (1935). Mainland (1948) has suggested using
such a method when dealing with discrete distributions, but without some adjustment
it may lead to values of P greater than unity. It is worth studying some examples to
appreciate the difficulties more fully.

Table 1A. The table shows the results of drawing samples of five from a population where the
probability of a single success is p′ and of combining the results of these drawings by different
techniques. In each case the hypothesis tested is that p equals 0.5 and the significance level
is 5 %. The body of the table shows (i) the power of the test, that is, the percentage frequency
with which the null hypothesis will be rejected using a 5 % significance level, and (ii) the
expectations of the various $\chi^2$.

[Body of Table 1A, printed sideways in the original, is not legible in this copy.]

* E indicates the expectation.
† $\chi_m^2$ and $\chi_{m'}^2$ give the same results as the crude $\chi^2$ for the combination of 2 or 3 experiments.

Table 1B. The table illustrates the problem considered in Table 1A, except that it now relates
to the results of drawing samples of n = 10 and combining them in various ways. The null
hypothesis is that p = 0.5.

[Body of Table 1B, printed sideways in the original, is not legible in this copy.]

* E indicates the expectation.
Note. All methods give the same percentage of significant results in a single experiment.
$\chi_m^2$ gives the same results as the crude $\chi^2$ for combination of two and three experiments.
$\chi_{m'}^2$ gives a slightly lower percentage of significant results than the crude $\chi^2$ and $\chi_m^2$.

6. Examples of some difficulties arising in the transformation, $\chi^2 = -2\log_e P$. (a) The
expectation of $\chi^2$ is always below 2 for discrete populations when the null hypothesis is true,
as we now illustrate in the case of the binomial $(\tfrac{1}{2} + \tfrac{1}{2})^4$. We may tabulate the possible events,
their relative frequency and the corresponding values of P and $(-2\log_e P)$ for a one-sided
comparison:

Successes    Relative frequency    Cumulative probability    −2 log_e P
(m)          of m                  (P)

4            0.0625                1.0000                    0
3            0.2500                0.9375                    0.12908
2            0.3750                0.6875                    0.74939
1            0.2500                0.3125                    2.32630
0            0.0625                0.0625                    5.54518

We conclude that the expectation of $\chi^2$, i.e. of $-2\log_e P$, is 1.241, when the null hypothesis,
that p = 0.5, is true.

For a two-sided comparison, we obtain the results set out below:

Successes    Probability    Cumulative probability    −2 log_e P
(m)          of (m)         (P)

0 or 4       0.125          0.125                     4.15888
1 or 3       0.500          0.625                     0.94007
2            0.375          1.000                     0.00000

Here, the expectation of $\chi^2$ is 0.990 instead of the theoretical value of 2. It will be noted that
in this case the division point is at m = 2, and the procedure adopted is equivalent to summing
from the lower tail on half the occasions when m = 2 and from the upper tail in the other
half and, then, taking P as double this tail sum.
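
The two expectations quoted in example (a), and the corresponding expectation of the mean value $\chi^2$ of equation (1), can be reproduced by enumeration. The Python sketch below is a minimal illustration; the function names are hypothetical, and the one-sided P is accumulated from m = 0 upwards as in the first table.

```python
from math import comb, log


def null_pmf(n: int) -> list[float]:
    """Relative frequencies of m = 0..n successes for the binomial (1/2 + 1/2)^n."""
    return [comb(n, m) / 2**n for m in range(n + 1)]


def one_sided(n: int) -> tuple[float, float]:
    """Return (E[-2 ln P], E[mean value chi^2]) with P summed from m = 0 upwards."""
    f = null_pmf(n)
    e_log = e_mean = 0.0
    p_prev = 0.0  # P', the probability of the more extreme results alone
    for m in range(n + 1):
        p = p_prev + f[m]
        e_log += f[m] * (-2 * log(p))
        # Equation (1); the term P' ln P' is taken as 0 when P' = 0.
        tail = p_prev * log(p_prev) if p_prev > 0 else 0.0
        e_mean += f[m] * (2 - 2 * (p * log(p) - tail) / (p - p_prev))
        p_prev = p
    return e_log, e_mean


def two_sided(n: int) -> float:
    """E[-2 ln P] with P the probability of results at least as far from n/2."""
    f = null_pmf(n)
    total = 0.0
    for m in range(n + 1):
        p = sum(f[j] for j in range(n + 1) if abs(2 * j - n) >= abs(2 * m - n))
        total += f[m] * (-2 * log(p))
    return total


if __name__ == "__main__":
    e_log, e_mean = one_sided(4)
    print(round(e_log, 3), round(e_mean, 3), round(two_sided(4), 3))
```

The mean value $\chi^2$ comes out at exactly 2 because the terms $P\log_e P - P'\log_e P'$ telescope when weighted by $(P - P')$ and summed over all outcomes.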













(b) When, however, the distribution is asymmetric, a further modification is needed.
Consider the case of the binomial $(\tfrac{2}{3} + \tfrac{1}{3})^5$; here the mean lies between m = 1 and m = 2.
Col. 3 of the table shows the cumulated probabilities from either tail. In the final column,
instead of multiplying by 2, we have adjusted P by dividing by the proportion of observations
in the respective tail. This avoids the possibility of having a probability, P, more than unity.

Successes    Probability    Cumulative probability    Adjusted
(m)          of m           from one tail             P

0            0.131687       0.131687                  0.28570
1            0.329218       0.460905                  1.00000
2            0.329218       0.539094                  1.00000
3            0.164609       0.209876                  0.38931
4            0.041152       0.045267                  0.08397
5            0.004115       0.004115                  0.00763
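
The adjustment used in example (b) can be written out as a short Python sketch. The function name and the `split` argument (the largest m assigned to the lower tail) are assumptions made for illustration; the sketch assumes a success probability of 1/3 in samples of 5, which reproduces the tabulated probabilities.

```python
from math import comb


def adjusted_p(n: int, p: float, split: int) -> list[float]:
    """Adjusted P for m = 0..n; outcomes with m <= split form the lower tail."""
    f = [comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(n + 1)]
    lower = sum(f[: split + 1])   # proportion of observations in the lower tail
    upper = 1.0 - lower           # proportion of observations in the upper tail
    out = []
    for m in range(n + 1):
        if m <= split:
            # Cumulate from m = 0 and divide by the lower-tail proportion.
            out.append(sum(f[: m + 1]) / lower)
        else:
            # Cumulate from m = n downwards and divide by the upper-tail proportion.
            out.append(sum(f[m:]) / upper)
    return out


if __name__ == "__main__":
    for m, value in enumerate(adjusted_p(5, 1 / 3, split=1)):
        print(m, round(value, 5))
```

Dividing by the tail proportion guarantees that the largest adjusted P in each tail is exactly 1, which is the property the adjustment is designed to secure.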





(c) As another illustration, take the binomial $(\tfrac{2}{3}+\tfrac{1}{3})^6$. Here the question arises: how
should the observations corresponding to a value $m = 2$ (the mean) be apportioned? Following
the method of the preceding example, we should get different adjusted values for
$P$ and therefore different expectations for $-2\log_e P$, according to the tail to which we
assign samples with $m = 2$. It would, of course, be possible, but troublesome in practice,
to obtain a median division by using some random method of allocating samples having
$m = 2$, to the lower tail with a probability of 0.14884 and to the upper tail with a
probability of 0.18038.





Successes   Probability   Cumulative probability (P)
   (m)         of m       summed from appropriate tail

    0        0.08779              0.08779
    1        0.26337              0.35116
    2        0.32922                 ?
    3        0.21948              0.31962
    4        0.08231              0.10014
    5        0.01646              0.01783
    6        0.00137              0.00137

















7. The mean value $\chi^2$. To illustrate the solution proposed, suppose we are dealing with
samples of 5, that the null hypothesis is that $p = \tfrac{1}{3}$ and that we are using a one-sided test,
i.e. only concerned with the alternative that $p' > \tfrac{1}{3}$. The following table shows the relevant
quantities. On the null hypothesis, $y = -2\log_e P$ can assume the six discrete values given in
col. 4 with the corresponding relative frequencies given in col. 2. In Fig. 1 the latter have
been plotted as ordinates against the former as abscissae. If $P$ were a continuous variable,
then $y = -2\log_e P$ would have the exponential frequency function $f(y)$ of equation (7); this
curve also is shown in the diagram. The mean of the frequency curve is at 2. The mean of the







H. O. LANCASTER 377 


discrete distribution (i.e. the sum of the products of the corresponding numbers in cols. 2
and 4 of the table) is 1.314. There is clearly a considerable bias.


Case of binomial $(\tfrac{2}{3}+\tfrac{1}{3})^5$; one-sided test

  i    Relative frequency   Cumulative probability   y_i = -2 log_e P_i     ȳ_i     -2 log_e ½(P_i + P_{i+1})
         P_i - P_{i+1}               P_i
 (1)         (2)                     (3)                    (4)             (5)              (6)

  0        0.131687               1.000000                0.0000           0.1379           0.1362
  1        0.329218               0.868313                0.2824           0.7213           0.7028
  2        0.329218               0.539095                1.2357           2.0329           1.9644
  3        0.164609               0.209877                3.1225           4.2788           4.1181
  4        0.041153               0.045268                6.1903           7.7108           7.4026
  5        0.004115               0.004115               10.9862          12.9862          12.9862

                                      Expectation   Variance
χ² with 2 D.F. (theoretical)             2.000        4.000
y_i                                      1.314        2.482
ȳ_i                                      2.000        3.689
-2 log_e ½(P_i + P_{i+1})                1.932        3.443
[Figure: ordinates, relative frequency (0.1 to 0.6); abscissae, $y = -2\log_e P$ (0 to 12).]

Fig. 1. (a) Frequency curve $f(y) = \tfrac{1}{2}e^{-\frac{1}{2}y}$ and (b) probabilities for the binomial $(\tfrac{2}{3}+\tfrac{1}{3})^5$.


In general, suppose that there are $s$ possible successive values of $-2\log_e P$ and that we
denote these by
$$y_i = -2\log_e P_i \quad (i = 0, 1, 2, \ldots, s-1). \tag{8}$$
Call $P_0 = 1$, $y_0 = 0$, $P_s = 0$ and $y_s = \infty$. Then
$$P_i - P_{i+1} = \int_{y_i}^{y_{i+1}} f(y)\,dy. \tag{9}$$


Thus the bias in the expectation of $y_i$ arises because the frequency which in the continuous
distribution is spread between $y_i$ and $y_{i+1}$ is concentrated in the discrete distribution at the
lower end of the interval. To correct this, we propose to replace $y_i$ in the test by
$$\bar{y}_i = \int_{y_i}^{y_{i+1}} y f(y)\,dy \Big/ \int_{y_i}^{y_{i+1}} f(y)\,dy \tag{10}$$
$$\phantom{\bar{y}_i} = \int_{P_{i+1}}^{P_i} (-2\log_e P)\,dP \Big/ (P_i - P_{i+1}) = 2 - 2\{P_i\log_e P_i - P_{i+1}\log_e P_{i+1}\}/(P_i - P_{i+1}). \tag{11}$$
Piss 
It will be seen from equation (10) that we are replacing $y_i = -2\log_e P_i$ by the mean value
of $y$ in the interval $y_i$ to $y_{i+1}$, as given by the frequency function $f(y)$ appropriate for the
continuous variable case. By doing this we ensure that the expectation of $\bar{y}_i$ is 2, for
$$E(\bar{y}_i) = \sum_{i=0}^{s-1}(P_i - P_{i+1})\,\bar{y}_i = \sum_{i=0}^{s-1}\int_{P_{i+1}}^{P_i}(-2\log_e P)\,dP = \int_0^1 (-2\log_e P)\,dP = 2. \tag{12}$$
Further, the variance of $\bar{y}_i$ will approach 4 from below, as the number of groups is increased.
To show this, we note that the variance of the frequency function, i.e. of the distribution
of $\chi^2$ with 2 degrees of freedom, may be split up as follows:
$$\begin{aligned}
\int_0^\infty f(y)(y-2)^2\,dy &= \sum_{i=0}^{s-1}\int_{y_i}^{y_{i+1}} f(y)(y-2)^2\,dy \\
&= \sum_{i=0}^{s-1}\left\{\int_{y_i}^{y_{i+1}} f(y)(\bar{y}_i-2)^2\,dy + \int_{y_i}^{y_{i+1}} f(y)(y-\bar{y}_i)^2\,dy\right\} \\
&= \sum_{i=0}^{s-1}\left\{(P_i - P_{i+1})(\bar{y}_i-2)^2 + \int_{y_i}^{y_{i+1}} f(y)(y-\bar{y}_i)^2\,dy\right\}.
\end{aligned} \tag{13}$$
The first expression on the right-hand side of (13) is the variance of $\bar{y}_i$, which therefore falls
short of the variance of $\chi^2$, i.e. of 4, by the second term on the right-hand side of (13). This
latter expression depends on the width of intervals of $y$ imposed by the null hypothesis and
will usually be small, unless there are very few groups.

We see from (11) that, in the event of the observation being the most extreme possible,
$$\bar{y}_{s-1} = 2 - 2\log_e P_{s-1}, \tag{14}$$
since $P_s = 0$.

If we write $\chi^2_m$ for $\bar{y}_i$, $P$ for $P_i$ and $P'$ for $P_{i+1}$, we have the form of the mean value $\chi^2$ given
at the beginning of this paper in equation (1). The distribution of $\chi^2_m$ or $\bar{y}_i$ is of course still
discrete, but since it has an expectation of 2 and a variance slightly under 4, it may be
expected that if $k$ independent samples are combined by summing the values of $\chi^2_m$, the resulting
statistic will be distributed approximately as $\chi^2$ with $2k$ degrees of freedom. The bias involved
in summing $-2\log_e P$ has, at any rate, been eliminated.

In order to avoid the tedious calculations necessary for $\chi^2_m$ we have defined
$$\chi'^2_m = -2\log_e \tfrac{1}{2}(P_i + P_{i+1}) = -2\log_e \tfrac{1}{2}(P + P') \tag{15}$$
for all values of $P'$, except when $P' = 0$. In this case, $\chi^2_m$ has the simple form given by
equation (14), and we assign this value also to $\chi'^2_m$. This slightly reduces the bias of $\chi'^2_m$, which
has an expectation a little below 2. It is found, for the ordinary sets of probabilities which
may arise in practice, that $\chi^2_m$ and $\chi'^2_m$ approximate closely. Their expectations for a number of
cases are given in Tables 1A and 1B, and are seen to be approximately equal. For the example
considered at the beginning of the present section, the individual values of $\chi^2_m = \bar{y}_i$ and
$\chi'^2_m = -2\log_e \tfrac{1}{2}(P_i + P_{i+1})$ are shown in cols. (5) and (6) of the table on p. 377; expectations
and variances are compared below the table.
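The quantities in cols. (4) to (6) of the table on p. 377, together with their expectations and variances, follow directly from equations (11), (14) and (15). The sketch below recomputes them for the binomial with $n = 5$, $p = \tfrac{1}{3}$; the function names are assumptions made for this illustration, and the upper tail is used because the alternative considered is $p' > \tfrac{1}{3}$.

```python
from math import comb, log

def mean_and_median_values(n, p):
    """Rows (frequency, y_i, mean value, median value) for a one-sided binomial test."""
    probs = [comb(n, m) * p**m * (1 - p)**(n - m) for m in range(n + 1)]
    rows = []
    for i, freq in enumerate(probs):
        p_i = sum(probs[i:])          # P_i, so that P_0 = 1
        p_next = sum(probs[i + 1:])   # P_{i+1}, equal to 0 for the last group
        y = -2.0 * log(p_i)
        if p_next > 0.0:
            # Equation (11): mean of -2 ln P over the interval (P_{i+1}, P_i).
            y_mean = 2.0 - 2.0 * (p_i * log(p_i) - p_next * log(p_next)) / freq
            # Equation (15): -2 ln of the mid-point of the interval.
            y_median = -2.0 * log(0.5 * (p_i + p_next))
        else:
            # Equation (14), also assigned to the median value when P' = 0.
            y_mean = 2.0 - 2.0 * log(p_i)
            y_median = y_mean
        rows.append((freq, y, y_mean, y_median))
    return rows

def expectation_and_variance(rows, column):
    """Expectation and variance of one column, weighted by the relative frequencies."""
    mean = sum(r[0] * r[column] for r in rows)
    var = sum(r[0] * r[column] ** 2 for r in rows) - mean ** 2
    return mean, var

rows = mean_and_median_values(5, 1 / 3)
for label, column in (("y_i", 1), ("mean value", 2), ("median value", 3)):
    e, v = expectation_and_variance(rows, column)
    print(label, round(e, 3), round(v, 3))
```

The mean value has expectation 2 by construction, because the terms of equation (11) telescope when weighted by $P_i - P_{i+1}$; only its variance falls short of the theoretical 4.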










THE FOURFOLD TABLE 


8. In considering the problem for a fourfold table, we shall illustrate the position on the
case where it is supposed that samples of ten are drawn randomly from two populations in
which the chances of an individual possessing a certain character are $p = (1-q)$ and
$p' = (1-q')$, respectively. The sample results will therefore be recorded as shown below.
On the assumption that $p = p'$, and within any set of samples for which $a + b$ = constant,
$a$ follows a distribution of hypergeometric type with
$$E(a) = \tfrac{1}{2}(a+b), \qquad \text{Variance of } a = (a+b)(20-a-b)/76.$$

               No. with      No. without     Totals
               character     character

1st sample         a            10-a           10
2nd sample         b            10-b           10

Totals            a+b         20-a-b           20


The definitions of the quantities $P$, $P'$, $\chi^2_m$, $\chi'^2_m$ given in § 2 for the binomial can therefore be
extended to apply to the hypergeometric. Further, for the crude $\chi^2$ and $\chi^2_c$, we have
$$\left.\begin{aligned}
\chi^2 &= 19(a-b)^2/\{(a+b)(20-a-b)\}, \\
\chi^2_c &= 20(|a-b|-1)^2/\{(a+b)(20-a-b)\}.
\end{aligned}\right\} \tag{16}$$
To compute $-2\log_e P$ we calculate the sum of the hypergeometric terms in the usual way,
assuming $a + b$ fixed, and make a two-tailed comparison. We note that under the null
hypothesis, $p = p'$,
$$E(\chi^2) = 1, \tag{17}$$
$$E(\chi^2_c) = 20\{1 - (\text{mean deviation} - \tfrac{1}{4})/(\text{variance})\}/19. \tag{18}$$
If $p \ne p'$, the expected frequencies of $a$ and $b$ will be given by the terms of the product of the
two binomials, $(p+q)^{10}$ and $(p'+q')^{10}$. Thus
$$P(a, b \mid p, p') = {}^{10}C_a\, p^a q^{10-a} \times {}^{10}C_b\, p'^b q'^{10-b}. \tag{19}$$
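Equations (17) and (18) can be verified by enumerating the hypergeometric distribution of $a$ for a fixed margin. The sketch below does so for two samples of ten; the function names are assumptions for illustration, and $\chi^2_c$ is evaluated exactly as written in (16), including the tables with $a = b$.

```python
from math import comb

def conditional_distribution(k):
    """Hypergeometric distribution of a, given a + b = k, under p = p'."""
    lo, hi = max(0, k - 10), min(10, k)
    denom = comb(20, k)
    return {a: comb(10, a) * comb(10, k - a) / denom for a in range(lo, hi + 1)}

def crude_chi2(a, b):
    # First line of equation (16).
    return 19 * (a - b) ** 2 / ((a + b) * (20 - a - b))

def corrected_chi2(a, b):
    # Second line of equation (16), with the continuity correction.
    return 20 * (abs(a - b) - 1) ** 2 / ((a + b) * (20 - a - b))

def conditional_expectations(k):
    """E(crude chi2), E(corrected chi2) and the right-hand side of (18) for margin k."""
    dist = conditional_distribution(k)
    mean = k / 2
    variance = k * (20 - k) / 76
    e_crude = sum(pr * crude_chi2(a, k - a) for a, pr in dist.items())
    e_corrected = sum(pr * corrected_chi2(a, k - a) for a, pr in dist.items())
    mean_deviation = sum(pr * abs(a - mean) for a, pr in dist.items())
    formula_18 = 20 * (1 - (mean_deviation - 0.25) / variance) / 19
    return e_crude, e_corrected, formula_18

for k in (2, 6, 10):
    print(k, [round(x, 4) for x in conditional_expectations(k)])
```

The margins $k = 0$ and $k = 20$ are left out because both statistics are then undefined (zero divided by zero).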
Using (19) we can compute, for given $p$ and $p'$, the probability or relative frequency of each
fourfold table that can arise and hence obtain the expectations of $\chi^2$, $\chi^2_c$, etc. Results so
obtained are given in Table 2. It will be seen that the expectation of $\chi^2_c$, and also that of the
probability integral transformation ($-2\log_e P$), is always less than the theoretical value,
when the null hypothesis is true. The advantage of using $\chi$ rather than $\chi_c$ in repeated sampling
was discussed from a rather different angle in connexion with the fourfold table by E. S. Pearson
(1947, pp. 151-6). Both the expectations of $\chi^2_c$ and $-2\log_e P$ are also less than the
theoretical value when there is some moderate departure from the null hypothesis. The crude
$\chi^2$ as defined by (16) and the $\chi^2_m$ both have an expectation of the theoretical magnitude when
the null hypothesis is true and an expectation above the theoretical, 1 or 2 respectively, when
it is false. In the case of the binomial, we were able to compare the effect of using various
methods of combination by an enumeration of cases and a calculation of the probabilities
of obtaining a result significant at the 5% level, when 2, 3, 4 and a very large number of








Table 2. A table to show the expectations of the various $\chi^2$ calculated from
the fourfold tables as formed by the methods of this article

                                              Expectations of
Null          Value     Value
hypothesis    of p      of p'     χ²_c     Crude χ²    -2 log_e P     χ²_m

True          0.5       0.5       0.482     1.000        1.040        2.000
              0.45      0.45      0.480     1.000        1.035        2.000
              0.40      0.40      0.471     1.000        1.021        2.000
              0.33      0.33      0.448     1.000        0.984        2.000
              0.25      0.25      0.387     1.000        0.904        2.000

False         0.55      0.25      1.715     2.738        2.989        4.292
              0.50      0.25      1.326     2.229        2.408        3.631
              0.45      0.25      1.009     1.805        1.906        3.080
              0.60      0.40      1.009     1.751        1.904        3.007
              0.55      0.45      0.612     1.188        1.257        2.255








experiments are combined. In the case of the fourfold table the computational work would
be prohibitive, so we have calculated a number $k$ such that $k \times E(\chi^2)$ is greater than the 5%
point of the tabulated $\chi^2$ for the appropriate number of degrees of freedom. In other words
we are calculating how many experiments we shall have to combine to give a power to our
test of 50%, assuming that in a number of experiments added together in this way there is
a probability of 0.5 of obtaining a total greater than its expectation. The results are tabulated
in Table 3.


Table 3. Values of k for different methods of combination

Value of p   Value of p'     χ²_c       Crude χ²         -2 log_e P    Mean value χ²_m

   0.55         0.25          14         3                   14              3
   0.50         0.25          60         [illegible]        100              6
   0.45         0.25        10,000       [illegible]          *             12
   0.60         0.40        10,000       [illegible]          *             13
   0.55         0.45           *         200 (approx.)        *         300 (approx.)

* In these cases the test cannot be said to have power.


Thus, for example, when $p = 0.55$, $p' = 0.25$, we obtain from Table 2:

  k     k × E(-2 log_e P)     Degrees of       5% point of χ²
           = k × 2.989        freedom (2k)     for 2k degrees of freedom

 13          38.857               26               38.885
 14          41.846               28               41.337

















Thus, we conclude that there is not a 50% chance of establishing significance at the 5%
level, using $-2\log_e P$, until fourteen samples are available for combination. On the other
hand, for these same values of $p$ and $p'$, using crude $\chi^2$ and combining three samples we find
that $3 \times E(\chi^2) = 3 \times 2.738 = 8.214$, and is greater than the 5% significance level of $\chi^2$ for
3 degrees of freedom, namely, 7.815.

It is to be noted that the crude $\chi^2$ is slightly more powerful than the mean value $\chi^2_m$, but
that both are very much superior to $\chi^2_c$ or to the $\chi^2$ obtained from the usual form of the
probability integral transformation. Where the expectation of $\chi^2$ in the fourfold tables is below
the theoretical value, we have assigned no value to $k$ since it is clear that an indefinite
repetition of the experiment would not add to the power of the test.
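The rule used for Table 3 compares $k$ times the expectation with the 5% point of $\chi^2$ for the combined degrees of freedom. A minimal sketch of that comparison for the two cases worked in the text follows; the dictionary holds only the tabulated 5% points needed here (7.815, 38.885 and 41.337 are quoted in the text, 5.991 is the standard value for 2 degrees of freedom), and the function name is an assumption.

```python
# Upper 5% points of chi-square for the degrees of freedom used below.
CHI2_5PCT = {2: 5.991, 3: 7.815, 26: 38.885, 28: 41.337}

def exceeds_5pct_point(k, expectation, df_per_experiment):
    """True when k times the expectation exceeds the 5% point for the combined d.f."""
    df = k * df_per_experiment
    return k * expectation > CHI2_5PCT[df]

# -2 log_e P carries 2 degrees of freedom per experiment; E = 2.989 from Table 2.
print(exceeds_5pct_point(13, 2.989, 2), exceeds_5pct_point(14, 2.989, 2))
# Crude chi-square carries 1 degree of freedom per experiment; E = 2.738.
print(exceeds_5pct_point(2, 2.738, 1), exceeds_5pct_point(3, 2.738, 1))
```

Thirteen experiments fall just short for $-2\log_e P$ (38.857 against 38.885), which is why fourteen are required, whereas three suffice for the crude $\chi^2$.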


9. An instructive example. The following fourfold table is taken from Mainland (1948, 
p. 54). The animals are from his ‘group A’. 














                     Died    Survived    Total

Treated animals        0        12         12
Control animals        2        10         12

Total                  2        22         24

















We find that in the set of samples with these margins, the exact probability that $a = 0$ on the
null hypothesis is 11/46 or 0.2391304, and the corresponding value for $\chi^2_c$ is 6/11 = 0.5455. For
all possible tables with this set of marginal totals, $E(\chi^2_c)$ is only 0.2609 if the null hypothesis
be true. Mainland, however, comments thus: 'Again it appears that the treatment tends to
reduce mortality, but in neither group, with $P = 0.025$ as the standard, is the difference
significant. Group A, $n = 12$, $P = 0.2391$.' On the next page he goes on to use this value of
$\chi^2_c$ and a similar one from group B to obtain a $\chi^2$ for two degrees of freedom. We may note
that the observation recorded is the furthest possible from the null hypothesis, and yet,
were we to obtain it repeatedly, we should soon obtain a significantly low value of $\chi^2$. Such
usages cannot be defended. In the example above, $-2\log_e P$ has an expectation of only
0.9694 for the two-tailed test in place of the theoretical 2. However, we could calculate
a mean value $\chi^2_m$ which will have an expectation of exactly 2 on the null hypothesis.
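The exact figures quoted for Mainland's table can be confirmed with rational arithmetic. The sketch below is illustrative only; the function names are assumptions, and the continuity correction is clipped at zero (so that the table with one death in each group gives $\chi^2_c = 0$), a convention assumed here because it reproduces the quoted expectation of 0.2609.

```python
from fractions import Fraction
from math import comb

def table_probability(a, row1, row2, col1):
    """Exact hypergeometric probability of a in the first row, all margins fixed."""
    return Fraction(comb(row1, a) * comb(row2, col1 - a), comb(row1 + row2, col1))

def corrected_chi2(a, row1, row2, col1):
    """Continuity-corrected chi-square; correction clipped at zero (assumption)."""
    b = col1 - a
    c, d = row1 - a, row2 - b
    n = row1 + row2
    excess = max(abs(a * d - b * c) - Fraction(n, 2), Fraction(0))
    return n * excess ** 2 / (row1 * row2 * col1 * (n - col1))

# Mainland's group A: 12 treated, 12 controls, 2 deaths in all.
p_observed = table_probability(0, 12, 12, 2)
chi2_observed = corrected_chi2(0, 12, 12, 2)
expectation = sum(
    table_probability(a, 12, 12, 2) * corrected_chi2(a, 12, 12, 2) for a in range(3)
)
print(p_observed, chi2_observed, expectation, float(expectation))
```

An expectation of 6/23 for a statistic referred to $\chi^2$ with 1 degree of freedom shows why repeated summation of such values would eventually give a significantly low total.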


10. An advantage of the probability integral transformation. We have seen that the
probability integral transformation, using $\chi^2_m$ or $\chi'^2_m$ of equations (1) and (2), has approximately
equal power with the crude $\chi^2$, when used in the summation of two-way tests from the
binomial distribution and fourfold tables; it has, however, the additional advantage that it
can also be used for the combination of one-way tests and thus derive additional power in
those cases where the departures from the null hypothesis we are looking out for are in the
one direction. We can see that this is equivalent to the addition of a quantity ($-2\log_e \tfrac{1}{2}$),
or 1.39 approximately, to the $\chi^2_m$ or $\chi'^2_m$ of every experiment which gives a probability
near zero. Of course, similar transformations could be found whereby the probability from
an experiment can be transformed to the scale of $\chi^2$ for any number of degrees of freedom, but
the distributions of $\chi^2$ for 1 and 2 degrees have obvious advantages. First, the data are often
already available in the scale of $\chi$ or of $\chi^2$ for 1 degree of freedom; secondly, the simplicity
of the probability integral transformation is a very real advantage. Further, the suggested
mean value $\chi^2_m$ allows significance tests to be carried out and interpreted when the summed
$\chi^2_m$ is low. Such tests cannot be carried out with the ordinary $\chi^2 = -2\log_e P$ transformation,
since it has an expectation below the theoretical value, even when the null hypothesis is true.


SUMMARY 


11. Some numerical investigations have been carried out in the case of the binomial
distribution and for the fourfold table. It is shown for the combination of significance tests
that the usual probability integral transformation test loses much information and a slight
modification is suggested. It appears evident that there is no justification for the use of
$\chi^2_c$ in the combination of experiments. For combination of two-tailed significance tests, the
crude $\chi^2$ is the next most powerful test after simple pooling, and it can be readily used in
the combination of fourfold tables where pooling may not be possible. When the use of
crude $\chi^2$ is not practicable, it is suggested that the mean-value form of the probability integral
transformation $\chi^2_m$, defined in equation (1), should be used. A simple approximation to
$\chi^2_m$ is derived and given in equation (2); this we have termed the median value $\chi'^2_m$. For
the combination of probability tests when it is desirable to pay regard to the direction of
the observed deviation, either the mean or median value probability integral transformation
is superior to the crude $\chi^2$.


This work was carried out while the author was a Rockefeller Fellow in Medicine. He would
like to thank Prof. A. Bradford Hill of the London School of Hygiene for the facilities of his
department, Dr J. O. Irwin and Mr P. Armitage for help in clearing up doubtful points,
and Prof. E. S. Pearson for advice and help in redrafting the article.


REFERENCES 


COCHRAN, W. G. (1942). The χ² correction for continuity. Iowa St. Coll. J. Sci. 16, 421.

FISHER, R. A. (1932). Statistical Methods for Research Workers, 5th ed. Edinburgh: Oliver and Boyd.

MAINLAND, D. (1948). Statistical methods in medical research. I. Qualitative statistics (enumeration data). Canad. J. Res. 26, 1.

NEYMAN, J. & PEARSON, E. S. (1933). On the problem of the most efficient tests of statistical hypotheses. Philos. Trans. A, 231, 289.

PEARSON, E. S. (1938). The probability integral transformation for testing goodness of fit and combining independent tests of significance. Biometrika, 30, 134.

PEARSON, E. S. (1947). The choice of statistical tests illustrated on the interpretation of data classed in a 2×2 table. Biometrika, 34, 139.

PEARSON, K. (1933). On a method of determining whether a sample of size n supposed to have been drawn from a parent population having a known probability integral has probably been drawn at random. Biometrika, 25, 379.

SUKHATME, P. V. (1935). Proc. Ind. Acad. Sci. 2, 584.

YATES, F. (1934). Contingency tables involving small numbers and the χ² test. J. Roy. Statist. Soc. Suppl. 1, 217.










[ 383 ] 


NOTE ON THE APPLICATION OF FISHER’S k-STATISTICS 
By F. N. DAVID 


1. The k-statistics were introduced by R. A. Fisher in 1928. In a paper, remarkable not
only for the brilliance of the statistical technique but also for the condensation of the
mathematical argument, he set out the essential properties of the k-statistics, demonstrated how
powers and product moments and cumulants could be evaluated, and gave basic tables for
their use. Subsequent development has been carried out by J. Wishart (1929a, b, 1930, 1933)
and M. G. Kendall (1940a, b, c, 1942). Wishart worked chiefly on the multivariate case, while
M. G. Kendall concerned himself principally with proving rigorously the rules of procedure
set out by Fisher. In the course of his analysis he gave certain methods which enable some of
the algebra involved in Fisher's original procedure to be cut short. Nevertheless, even with
these short cuts the algebra necessary is heavy, and it appeared worth while to investigate
the possibility of shortening it still further, particularly for those cases where it is not possible
to make the $\kappa_1$ of the parent population equal to zero.


2. The technique involved in the application of the k-statistics to any problem of
distribution may be summarized briefly in the following way. We suppose that it is desired to
find the sampling moments of some statistic involving powers and/or products of the one-part
symmetric functions
$$s_r = \sum_{i=1}^{n} x_i^r \quad (r = 1, 2, 3, \ldots).$$
The statistic is written in the form of the k-statistics by direct substitution. Using Fisher's
notation we write
$$\mathcal{E}(k_1^{\alpha} k_2^{\beta} k_3^{\gamma} k_4^{\delta} \ldots) = \mu(1^{\alpha} 2^{\beta} 3^{\gamma} 4^{\delta} \ldots).$$
This product moment, which should properly be written as $\mu'(1^{\alpha} 2^{\beta} 3^{\gamma} 4^{\delta})$, is connected with
the product cumulants through the identity
$$\kappa(r)\,t_1 + \kappa(s)\,t_2 + \kappa(r^2)\frac{t_1^2}{2!} + \kappa(rs)\,t_1 t_2 + \kappa(s^2)\frac{t_2^2}{2!} + \ldots
= \log\left\{1 + \mu'(r)\,t_1 + \mu'(s)\,t_2 + \mu'(r^2)\frac{t_1^2}{2!} + \mu'(rs)\,t_1 t_2 + \mu'(s^2)\frac{t_2^2}{2!} + \ldots\right\},$$
the extension to any number of variables being obvious.
Fisher's basic tables give expressions for the product cumulants in terms of the parent
population cumulants and hence by substitution $\mu'(1^{\alpha} 2^{\beta} 3^{\gamma} 4^{\delta})$ may be evaluated.


3. The relations between the product moments and the product cumulants can most
easily be derived by making use of an ingenious process due to Kendall (1940c), which
considerably simplifies the labour. Since this method is not widely known we illustrate it here
for fourth powers and two k-statistics. We start from the well-known relation that for the
parent population
$$\mu'_4 = \kappa_4 + 4\kappa_3\kappa_1 + 3\kappa_2^2 + 6\kappa_2\kappa_1^2 + \kappa_1^4,$$
and hence for the fourth power of the k-statistics of order $r$ we may write
$$\mu'(r^4) = \kappa(r^4) + 4\kappa(r^3)\kappa(r) + 3\kappa^2(r^2) + 6\kappa(r^2)\kappa^2(r) + \kappa^4(r).$$
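The parent-population relation between the fourth moment about the origin and the cumulants can be checked numerically on any distribution whose cumulants are known. The sketch below uses a Poisson distribution, for which every cumulant equals the parameter; the choice of distribution and the function names are assumptions made purely for illustration.

```python
from math import exp

def poisson_raw_moment(lam, order, terms=200):
    """Moment about the origin of a Poisson variate, by direct summation."""
    prob = exp(-lam)  # probability of x = 0
    total = 0.0
    for x in range(terms):
        total += prob * x ** order
        prob *= lam / (x + 1)  # step to the probability of x + 1
    return total

def raw_fourth_from_cumulants(k1, k2, k3, k4):
    """Fourth moment about the origin in terms of the first four cumulants."""
    return k4 + 4 * k3 * k1 + 3 * k2 ** 2 + 6 * k2 * k1 ** 2 + k1 ** 4

lam = 2.5
print(poisson_raw_moment(lam, 4))
print(raw_fourth_from_cumulants(lam, lam, lam, lam))
```

The same relation, with $\kappa_r$ replaced by the product cumulants $\kappa(r^h)$ of a k-statistic, is the starting point for the operative process described in the text.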










384 Note on the application of Fisher's k-statistics 
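Operating on $\mu'(r^4)$ by $s\,\partial/\partial r$, and cancelling the factor 4, we have
$$\mu'(r^3 s) = \kappa(r^3 s) + 3\kappa(r^2 s)\kappa(r) + \kappa(r^3)\kappa(s) + 3\kappa(rs)\kappa(r^2) + 3\kappa(r^2)\kappa(r)\kappa(s) + 3\kappa(rs)\kappa^2(r) + \kappa^3(r)\kappa(s),$$
where $\kappa(r) = \kappa_r$ and $\kappa(s) = \kappa_s$,
the parent population cumulants of order $r$ and $s$. Again operating we have
$$\begin{aligned}
\mu'(r^2 s^2) = {}& \kappa(r^2 s^2) + 2\kappa(r s^2)\kappa(r) + 2\kappa(r^2 s)\kappa(s) + \kappa(s^2)\kappa(r^2) + \kappa(s^2)\kappa^2(r) + \kappa(r^2)\kappa^2(s) + 2\kappa^2(rs) \\
& + 4\kappa(rs)\kappa(r)\kappa(s) + \kappa^2(r)\kappa^2(s),
\end{aligned}$$
results which can be checked from the fundamental identity given above. For three variables
we operate on the appropriate power of $r$ by $s\,\partial/\partial r$ and $t\,\partial/\partial r$ and so on.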


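4. It is clear that in order to evaluate $\mu'(r^h s^l)$ when $h + l = 4$ or more, a great deal of
elementary algebra is required. It is suggested, therefore, that attention might be focused on
$$\mu(r^h s^l) = \mathcal{E}(k_r - \kappa_r)^h (k_s - \kappa_s)^l.$$
It is easy to show, using the relations of § 3, that for $h + l = 3$ or less
$$\mu(r^h s^l) = \kappa(r^h s^l),$$
as might be expected. For $h + l = 4$ we have
$$\begin{aligned}
\mu(r^4) &= \mathcal{E}(k_r - \kappa_r)^4 = \kappa(r^4) + 3\kappa^2(r^2), \\
\mu(r^3 s) &= \mathcal{E}(k_r - \kappa_r)^3 (k_s - \kappa_s) = \kappa(r^3 s) + 3\kappa(rs)\kappa(r^2), \\
\mu(r^2 s^2) &= \mathcal{E}(k_r - \kappa_r)^2 (k_s - \kappa_s)^2 = \kappa(r^2 s^2) + \kappa(s^2)\kappa(r^2) + 2\kappa^2(rs).
\end{aligned}$$
The algebra involved by direct substitution is, however, a little heavy, and it is simpler
to note that
$$\mathcal{E}(k_r - \kappa_r)^3 (k_s - \kappa_s) \quad\text{and}\quad \mathcal{E}(k_r - \kappa_r)^2 (k_s - \kappa_s)^2$$
can be derived from $\mu_4 = \kappa_4 + 3\kappa_2^2$ and $\mu(r^4) = \kappa(r^4) + 3\kappa^2(r^2)$
by Kendall's operative process just described. The relations for $h + l = 5, 6$ may thus easily
be shown to be
$$\begin{aligned}
\mathcal{E}(k_r - \kappa_r)^5 = \mu(r^5) &= \kappa(r^5) + 10\kappa(r^3)\kappa(r^2), \\
\mu(r^4 s) &= \kappa(r^4 s) + 6\kappa(r^2 s)\kappa(r^2) + 4\kappa(r^3)\kappa(rs), \\
\mu(r^3 s^2) &= \kappa(r^3 s^2) + 6\kappa(r^2 s)\kappa(rs) + 3\kappa(r s^2)\kappa(r^2) + \kappa(r^3)\kappa(s^2), \\
\mathcal{E}(k_r - \kappa_r)^6 = \mu(r^6) &= \kappa(r^6) + 15\kappa(r^4)\kappa(r^2) + 10\kappa^2(r^3) + 15\kappa^3(r^2), \\
\mu(r^5 s) &= \kappa(r^5 s) + 5\kappa(r^4)\kappa(rs) + 10\kappa(r^3 s)\kappa(r^2) + 10\kappa(r^3)\kappa(r^2 s) + 15\kappa^2(r^2)\kappa(rs), \\
\mu(r^4 s^2) &= \kappa(r^4 s^2) + \kappa(r^4)\kappa(s^2) + 8\kappa(r^3 s)\kappa(rs) + 6\kappa(r^2 s^2)\kappa(r^2) + 6\kappa^2(r^2 s) \\
&\quad + 4\kappa(r^3)\kappa(r s^2) + 12\kappa(r^2)\kappa^2(rs) + 3\kappa^2(r^2)\kappa(s^2), \\
\mu(r^3 s^3) &= \kappa(r^3 s^3) + 3\kappa(r^3 s)\kappa(s^2) + 9\kappa(r^2 s^2)\kappa(rs) + 3\kappa(r s^3)\kappa(r^2) + 9\kappa(r^2 s)\kappa(r s^2) \\
&\quad + \kappa(r^3)\kappa(s^3) + 6\kappa^3(rs) + 9\kappa(r^2)\kappa(s^2)\kappa(rs).
\end{aligned}$$

5. In order to avoid confusion with Fisher's notation we shall write
$$\mu(r^h s^l) = \mathcal{E}(k_r - \kappa_r)^h (k_s - \kappa_s)^l = (r^h s^l).$$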







F. N. DAVID 385


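It would seem that if a table of $(r^h s^l)$ functions were available a certain amount of the algebra
involved in the application of k-statistic technique would be avoided. For example, suppose
we consider the almost classical example of finding the moments of
$$x = \left\{\frac{(n-1)(n-2)}{6n}\right\}^{1/2} \frac{k_3}{k_2^{3/2}}$$
in samples of $n$ drawn from a normal population and take the process of obtaining $\mathcal{E}(x^2)$.
We have
$$\mathcal{E}(x^2) = \frac{(n-1)(n-2)}{6n}\,\mathcal{E}\left(k_3^2 k_2^{-3}\right) = \frac{(n-1)(n-2)}{6n\kappa_2^3}\,\mathcal{E}\left\{k_3^2\left(1 + \frac{k_2 - \kappa_2}{\kappa_2}\right)^{-3}\right\}.$$
The method as outlined by Fisher (1928) and quoted by Kendall (1943) consists in expanding
the right-hand bracket and taking expectations:
$$\mathcal{E}(x^2) = \frac{(n-1)(n-2)}{6n\kappa_2^3}\left\{\mu'(3^2) - \frac{3}{\kappa_2}\mu'(3^2\,2) + \frac{6}{\kappa_2^2}\mu'(3^2\,2^2) - \frac{10}{\kappa_2^3}\mu'(3^2\,2^3) + \ldots\right\}.$$
The product moments are next put in terms of the product cumulants $\kappa(r^h s^l)$, whence,
remembering that $k_2$ is measured from its expected value, we have
$$\begin{aligned}
\mathcal{E}(x^2) = \frac{(n-1)(n-2)}{6n\kappa_2^3}\Big[ & \kappa(3^2) - \frac{3}{\kappa_2}\kappa(3^2\,2) + \frac{6}{\kappa_2^2}\left(\kappa(3^2\,2^2) + \kappa(3^2)\kappa(2^2)\right) \\
& - \frac{10}{\kappa_2^3}\left(\kappa(3^2\,2^3) + 3\kappa(3^2\,2)\kappa(2^2) + \kappa(3^2)\kappa(2^3)\right) + \ldots \Big].
\end{aligned}$$
Reference to Fisher's tables gives the product cumulants in terms of the parent population
cumulants and the process is complete. As an alternative procedure suppose we write
$$\mathcal{E}(x^2) = \frac{(n-1)(n-2)}{6n}\,\frac{\kappa_3^2}{\kappa_2^3}\,\mathcal{E}\left\{\left(1 + \frac{k_3 - \kappa_3}{\kappa_3}\right)^{2}\left(1 + \frac{k_2 - \kappa_2}{\kappa_2}\right)^{-3}\right\}.$$
Upon expansion and taking expectations we have immediately the $(r^h s^l)$ functions, and all
that remains is to substitute from tables of these functions to obtain $\mathcal{E}(x^2)$ at once in terms
of the parent population cumulants. The saving in heavy algebra is considerable, even
though the original expansion is made a little more clumsy.*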


* For symmetrical populations it is of course simpler to note that $k_2$ and $k_3$ are uncorrelated, and to
turn the product of the expectations into $(r^h s^l)$ functions.


6. A complete table of $(r^h s^l)$ functions is obviously not necessary. For $h + l \le 3$ the
expressions for these functions are the product cumulants already tabled by Fisher or by Kendall.
When $h + l > 3$ a corrective term must be added to the product cumulant. Table 1 gives
a selection of these corrective terms, expressed as functions of the parent-population
cumulants, which have been found useful in certain recent investigations. It will be noted for use
in deciding order in expansions that product cumulants of weight $r$ will be of order $1/n^{r-1}$.
The corrective terms to a product cumulant of weight $r$ will be of order $1/n^{r-2}$ if the corrective
terms are two product cumulants multiplied together, of order $1/n^{r-3}$ where they are three,
and so on. Thus for $h + l = 6$ we have $\kappa(r^h s^l)$ is of order $1/n^5$, $\kappa(r^m)\kappa(r^n s^l)$ is of order $1/n^4$ and
$\kappa^3(r^2)$ is of order $1/n^3$. If, therefore, no terms are required beyond those involving $1/n^3$ it
will only be necessary to enumerate a single corrective term as the contribution from
$\kappa(r^h s^l)$ $(h + l = 6)$. In Table 1, printed on p. 392 below, the complete corrective terms are
given for $h + l = 4, 5$, and those corrective terms for $h + l = 6$ which involve $1/n^3$.


7. The question of the validity of the expansions in § 5 is a troublesome one. I have given 
elsewhere reasons showing that such a process can be justified in an approximate way for 
some statistics. For other statistics it is difficult to justify except in a limited range of cases. 
An example of this can be found in the coefficient of variation. We define this, for a sample
of $n$ elements, as
$$v = \frac{s}{\bar{x}},$$
where $\bar{x}$ has its usual meaning and
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2.$$
In samples from a normal population it is clear that no matter what the mean and standard
deviation of the parent population, the probability exists that $\bar{x}$ can be close to zero and can
be negative, and hence the true distribution of $v$ has, theoretically, infinite moments. In
order, therefore, to obtain approximations to the moments of $v$ it will be necessary to truncate
the parent population at the point $x = 0$. We must then make the further assumption that
n is chosen large enough for the proportionate frequency lying outside a given multiple of 
the standard deviation of the mean and of the standard deviation of the variance to be 
negligible. Under these two assumptions the expansion will be valid. If the parent population 
is not normal but is, for example, a Pearson curve with start fixed at zero or at some positive 
value, then it is necessary to make only the assumption regarding n, the sample size. 

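8. It is assumed that we have a parent population which has cumulants $\kappa_1, \kappa_2, \ldots$ up to
any order required. A sample of $n$ elements is randomly and independently drawn from such
a population, and to each element of the sample we attach a random variable $x_1, x_2, \ldots, x_n$.
We may then write
$$v = \frac{s}{\bar{x}} = \frac{k_2^{1/2}}{k_1} = V\left(1 + \frac{k_2 - \kappa_2}{\kappa_2}\right)^{1/2}\left(1 + \frac{k_1 - \kappa_1}{\kappa_1}\right)^{-1},$$
where
$$V = \frac{\kappa_2^{1/2}}{\kappa_1},$$
and is the coefficient of variation in the parent population. Under the restrictions just
mentioned we may expand the right-hand side of the expression for $v$, and, for example, for the
second moment about an arbitrary origin we obtain
$$\frac{1}{V^2}\mathcal{E}(v^2) = \mathcal{E}\left\{\left(1 + \frac{k_2 - \kappa_2}{\kappa_2}\right)\left(1 - \frac{2(k_1 - \kappa_1)}{\kappa_1} + \frac{3(k_1 - \kappa_1)^2}{\kappa_1^2} - \frac{4(k_1 - \kappa_1)^3}{\kappa_1^3} + \frac{5(k_1 - \kappa_1)^4}{\kappa_1^4} - \ldots\right)\right\},$$
or in terms of the $(r^h s^l)$ functions
$$\begin{aligned}
\frac{1}{V^2}\mathcal{E}(v^2) = 1 & - \frac{2}{\kappa_1}(1) + \frac{3}{\kappa_1^2}(1^2) - \frac{4}{\kappa_1^3}(1^3) + \frac{5}{\kappa_1^4}(1^4) - \ldots \\
& + \frac{1}{\kappa_2}(2) - \frac{2}{\kappa_2\kappa_1}(2\,1) + \frac{3}{\kappa_2\kappa_1^2}(2\,1^2) - \frac{4}{\kappa_2\kappa_1^3}(2\,1^3) + \ldots
\end{aligned}$$
It is obvious that $(1)$ and $(2)$ are zero, and the other terms may be written down from tables
of product cumulants and Table 1 of corrective terms.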





9. The first four sampling moments of the distribution of $v$ are, to order $1/n^2$,


Lo ra |. ae 1/ ix, ks | k 1 
760) = Hip) = 145 8x3 patti) 4(n—1) 


al 15 Ki 3 xyky, Ke 1 KR | Ks 3 Ky 5ks =) 





128xK4 16K3K, | 16K3 4x32 8x, x2 BKK, 2° KA 





™ 1 91 S. oo . 1g, 1 
nar Skyk, 4K? tae 32)’ 


v L/lx Ky « l 
Elv)? = 7) {S&F 
(v)) mF = Fe Kk, *3n— 1) 


(go Lkyk3 Ke 1 Ky 1 Ks 5 Ky 10K; =] 

















m2\32x4° 463K, 8x2 4K32K2 4,42 22K, 


1: (5K, 1 Ks -") 1 ing 1 
——__{— $4 —_3___$)____(_- +}, 
aera 2KeK, KG ys 7 


=(- 3xi lk, 9 Ke 3 Ky 3K, 10K, a) 











4 3 <2 2a. xe 4 
16x44 Bo 4K2K2 4nk, KK? 


Po oee Bs Hs) 4 leg tt 
aaa ee K,X, a (n— 1)? an 4 


Wa (fea 3K4Ks 3K3 3 Ky 3K? 2) 





n2 








1644 2exK OKeK A 8 
Gx 2xix, Kixi 3x, GG 


1 3K, 3K, =a) 1 (3 
+——.. —~—$.~2)4—_,(-}. 
ae Kak, Ki) ae 


For samples from a symmetrical population:
v 1k l 15x}. kK 3K 3K 
ees KX ee a a, Ke Ky Ks 
(5) (- 8K +44) ot i 128«4 + Tod * Bik,” “ 
1 (9k, 13) 4 1 
+n 32 «3 }  32(n—1)?’ 
v l/lKk, “a) ] 1(7«K Ke 5 Ky,  8xg 1 5k, “| 1 
od Rode lemteer ~~? a = oa = ae mt aT ee eae - as — a 
rly Agate ” 2(n — 1) al; 32K$ 8x3 : Oxi, Ki} n(n— hae Ki} = 8(n—1)? 


’ °) = 1 3 i, Xe, 3 BK, + OKs . | a+ wale l 
rs ry @\ Gea x2 xia, a) n(n—1)\4x2> x2)” 4(n—1)?” 


v' 1/3 i 3 ky 4 ‘ 1 [SK , Se ‘ 3 
* al Bh rae ips: | ye 
mlz) =; iat 2KiK, a) = ea Ke 4(n — 1)?" 


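For samples from a normal population:
$$\mu'_1\left(\frac{v}{V}\right) = 1 + \frac{V^2}{n} - \frac{1}{4(n-1)} + \frac{3V^4}{n^2} - \frac{V^2}{4n(n-1)} + \frac{1}{32(n-1)^2},$$
$$\mu_2\left(\frac{v}{V}\right) = \frac{1}{2(n-1)} + \frac{V^2}{n} + \frac{8V^4}{n^2} - \frac{1}{8(n-1)^2} + \frac{V^2}{n(n-1)},$$
$$\mu_3\left(\frac{v}{V}\right) = \frac{6V^4}{n^2} + \frac{3V^2}{n(n-1)} + \frac{1}{4(n-1)^2},$$
$$\mu_4\left(\frac{v}{V}\right) = \frac{3V^4}{n^2} + \frac{3V^2}{n(n-1)} + \frac{3}{4(n-1)^2}.$$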









10. In order that the truncation of the parent population at the point $x = 0$ should play
as small a part as possible, it will be necessary to choose $V$ small. Thus for the case of a normal
parent population it is assumed that $V$ can never be greater than $\tfrac{1}{3}$, or in other words it is
assumed that the truncation is made at a point farther away from the mean than $-3\sigma$. It is
understood that in practical problems it is rare to find a $V$ as large as $\tfrac{1}{3}$. The manipulative
algebra involved in obtaining the approximations to the moments given above is elementary
but heavy, and it would be satisfactory if it were possible to apply a check of some kind.
This is difficult for the general case. We may note that for the normal population, with $V$
small and $n$ large, we have
$$\mu_2\left(\frac{v}{V}\right) \to \frac{1}{2n}(1 + 2V^2),$$
a value given, it is believed, originally by K. Pearson. For the general case Kendall (1943)
gives
$$\mu_2\left(\frac{v}{V}\right) = \frac{1}{n}\left\{\frac{\kappa_4}{4\kappa_2^2} + \frac{\kappa_2}{\kappa_1^2} - \frac{\kappa_3}{\kappa_1\kappa_2}\right\} + \frac{1}{2n},$$
which, for large $n$, agrees with the leading terms of the expression for $\mu_2(v/V)$ for the general
case given in the preceding section. These small checks were all that could be found.








11. In spite of the restrictions made necessary in order to use the expansion, it is possible
to get some idea of the distribution of $v/V$ from the approximate moments, particularly if
all that is required is to gauge the approximate sample size for which it is reasonable to
assume that $v$ is normally distributed. We take the case where the parent population is
normal; the momental constants for selected values of $v/V$ and $n$ are given in Table 2.


Table 2. Momental constants of the distribution of v/V when the parent population is normal

Sample            V = 1/3                         V = 1/5                          V = 1/10
size n     μ'_1     σ      β_1    β_2      μ'_1          σ      β_1    β_2      μ'_1     σ      β_1    β_2

   6      0.970    0.346   0.31   2.93     0.956         0.321   0.18   3.20     0.951    0.311   0.13   3.29
  11      0.9855-  0.246   0.17   2.94     0.9786        0.230   0.09   3.09     0.9759   0.223   0.06   3.14
  21      0.9929   0.175   0.09   2.97     0.9894        0.163   0.05   3.04     0.9880   0.159   0.03   3.07
  31      0.9953   0.143   0.06   2.98     0.9930        0.134   0.03   3.03     0.9920   0.130   0.02   3.05
  41      0.9965-  0.124   0.05   2.98     [illegible]   0.116   0.02   3.02     0.9940   0.113   0.02   3.03














It is clear that the distribution of v/V as judged from the approximate moments tends to 
normality reasonably quickly as the sample size is increased, and that for samples of over 40 
little error will be made in assuming that v/V is normally distributed. 
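The momental constants of Table 2 can be recomputed from approximate moments of $v/V$ for a normal parent. The sketch below is hedged accordingly: the four moment expressions coded here are assumed forms, obtained to order $1/n^2$ from the independence of the sample mean and variance in normal samples, and the function name is an assumption. They are offered as a check on the table, not as the author's own working.

```python
def cv_ratio_constants(n, V):
    """Approximate mean, s.d., beta_1 and beta_2 of v/V for a normal parent.

    The moment expressions are assumed forms correct to order 1/n^2.
    """
    m = n - 1
    mean = (1 + V**2 / n - 1 / (4 * m) + 3 * V**4 / n**2
            - V**2 / (4 * n * m) + 1 / (32 * m**2))
    mu2 = (1 / (2 * m) + V**2 / n + 8 * V**4 / n**2
           - 1 / (8 * m**2) + V**2 / (n * m))
    mu3 = 6 * V**4 / n**2 + 3 * V**2 / (n * m) + 1 / (4 * m**2)
    mu4 = 3 * V**4 / n**2 + 3 * V**2 / (n * m) + 3 / (4 * m**2)
    return mean, mu2 ** 0.5, mu3**2 / mu2**3, mu4 / mu2**2

for n in (6, 41):
    print(n, [round(x, 4) for x in cv_ratio_constants(n, 1 / 3)])
```

For $V = \tfrac{1}{3}$ these expressions give values close to the tabulated entries at $n = 6$ and $n = 41$, and $\beta_1$ falls towards 0 as $n$ grows, in line with the approach to normality noted in the text.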


12. McKay (1932) considered the distribution of 


ep) 


‘ 


ae. 2S - 
where *=- and s*=-- >) (%,—2X)*. 
x& N j=} 










F. N. DAVID 389


He showed that this quantity is approximately distributed as χ² with n − 1 degrees of freedom
when the parent population is normal. In the previous sections of this paper we have been
considering
\[ v = \frac{s}{\bar{x}}, \]
where s² is the unbiased estimate of the population variance. It follows therefore that we
might expect 1 v2 
n—1)(14+- 75) (73) 


to be approximately distributed as χ² with n − 1 degrees of freedom in samples from a normal
population. We may use this approximation of McKay to check the adequacy of the
assumption that the distribution of v/V is normal when that of the parent population
generating the sample is normal. The procedure is as follows. We choose a significance level
for χ², say 5 % for the sake of illustration, and for a given n and V we calculate v₀.₀₅,
corresponding to this level. Hence, assuming v/V is normally distributed, we may calculate the
tail area corresponding to this v₀.₀₅. The results of such calculations for two values of n and
three values of V are given in Table 3. They again suggest that the assumption of normality
for v/V is not likely to lead to serious error in tests of significance, provided the sample is
of reasonable size.


Table 3. Estimated probability integral of the distribution of v/V corresponding to χ²₀.₀₅

n | First value of V | Second value of V | Third value of V
101 | 0.047 | 0.048 | 0.049
41 | 0.045 | 0.046 | 0.047




















13. It is seen (for example in Table 2) that as n increases, the distribution of v/V becomes
less and less dependent on V. We therefore look at the possibility of abbreviating the
expression for the second moment. Four possibilities may be considered. We assume that for a
normal parent population v/V is normally distributed about unity with standard errors
equal to

1 1 v2 1 : 
‘ | ae ae ~ aa 
(i) iP (ii) Tien +2” yt, (iii) (5 +3m—-1) 5) , 
1 8V4 1 ys ) 


_. (ve 
OY) tomo t nF a@—1* al) 





The results for sample size 41 and different values of V are given in Table 4. 


Table 4. Various estimates of standard error of v/V for samples of size 41

Expression for standard error | First value of V | Second value of V | Third value of V
(i) | 0.110 | 0.110 | 0.110
(ii) | 0.122 | 0.115 | 0.112
(iii) | 0.123 | 0.116 | 0.118
(iv) | 0.124 | 0.116 | 0.113


























390 Note on the application of Fisher’s k-statistics 


Approximation (i) is clearly inadequate for the size of sample considered, (ii) is not far off
(iii) and (iv), and will probably be good enough for rough tests of significance. Actually,
since it gives values less than the values obtained from (iv), the probability integral calculated
from it will give values closer to the true significance level as judged by the χ² distribution.
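The two simplest approximations are easily evaluated. The sketch below rests on two assumptions that should be checked against the printed expressions: that (i) is 1/√(2n), and that (ii) is the familiar first-order result √(1/(2n) + V²/n). The trial values V = 1/3 and V = 1/5 are hypothetical choices made for illustration only; with n = 41 they give 0.122 and 0.115, and (i) gives 0.110.

```python
import math


def se_leading(n):
    """Assumed form of approximation (i): 1/sqrt(2n), independent of V."""
    return math.sqrt(1.0 / (2 * n))


def se_first_order(n, V):
    """Assumed form of approximation (ii): sqrt(1/(2n) + V^2/n)."""
    return math.sqrt(1.0 / (2 * n) + V * V / n)


n = 41
print("(i) ", round(se_leading(n), 3))
for V in (1 / 3, 1 / 5):  # hypothetical trial values of V
    print("(ii)", round(V, 3), round(se_first_order(n, V), 3))
```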

14. It will be seen, as might indeed be expected, that there is a similarity between the 
moments of the distribution of the standard deviation estimated from normal samples and 
those of the distribution of v/V. Approximations to the moments of the distribution of the 
standard deviation can be found quickly, using the k-statistic technique. Writing 


n 














. * (4. — #2 
7 (n— a zy 
then 6(%) — (14 “2—"2)' — 1+5-@)- (24) + (33) -_>_ (34) 4 
kh) Ke r 163 128x4 ae 





8k, 1 l( kK WR L (9K, Wise 
A ee ee + ‘) + moe\3at aa)? 
mkz 4(n—1) | m2\16K3  128K4) © n(m—1)\32 43) © (n—1)2\32 ° 4x3 
and for a normal population we have
\[ \mathcal{E}\left(\sqrt{k_2/\kappa_2}\right) = 1 - \frac{1}{4(n-1)} + \frac{1}{32(n-1)^2}, \qquad \mu_2\left(\sqrt{k_2/\kappa_2}\right) = \frac{1}{2(n-1)} - \frac{1}{8(n-1)^2}, \]
\[ \mu_3\left(\sqrt{k_2/\kappa_2}\right) = \frac{1}{4(n-1)^2}, \qquad \mu_4\left(\sqrt{k_2/\kappa_2}\right) = \frac{3}{4(n-1)^2}. \]
Thus the moments of √(k₂/κ₂) are those of v/V, in which V is put equal to zero in the final
expansion. Since the distribution of s tends fairly quickly to normality it may be expected
that the distribution of v/V will do likewise.
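For a normal parent the mean of s/σ is known exactly, which gives a check on the leading terms of such expansions. The sketch below compares the exact mean, √(2/ν) Γ(n/2)/Γ(ν/2) with ν = n − 1, against the series 1 − 1/(4ν) + 1/(32ν²), and the exact variance 1 − [ℰ(s/σ)]² against 1/(2ν) − 1/(8ν²). The series are the standard asymptotic forms and are assumed here to be the ones intended; the function names are illustrative.

```python
import math


def mean_s_over_sigma(n):
    """Exact E(s/sigma) for a normal sample of size n (divisor n - 1)."""
    nu = n - 1
    return math.sqrt(2.0 / nu) * math.exp(math.lgamma(n / 2) - math.lgamma(nu / 2))


def mean_series(n):
    """Leading terms of the expansion of E(s/sigma) in powers of 1/(n - 1)."""
    nu = n - 1
    return 1 - 1 / (4 * nu) + 1 / (32 * nu ** 2)


def var_series(n):
    """Leading terms of the expansion of var(s/sigma)."""
    nu = n - 1
    return 1 / (2 * nu) - 1 / (8 * nu ** 2)


n = 41
exact_mean = mean_s_over_sigma(n)
exact_var = 1 - exact_mean ** 2  # since E(s^2/sigma^2) = 1
print(exact_mean, mean_series(n))
print(exact_var, var_series(n))
```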





15. We return to the approximate moments of v/V and consider the case when the parent 
population is not normal. It will be assumed that the parent population may be described by 
a Gram-Charlier Series, Type A, and we write for its functional form 


ct) |(u+ + t= 5") + +n (= a 4)), 


where H,(x) and H,(x) are the Hermite polynomials of orders 3 and 4 respectively. This 
population has cumulants 





y= am? | (“a 


\[ \kappa_1,\ \kappa_2,\ \kappa_3,\ \kappa_4;\quad \kappa_5 = 0,\quad \kappa_6 = -10\kappa_3^2,\quad \kappa_7 = -35\kappa_3\kappa_4,\quad \kappa_8 = -35\kappa_4^2, \]
and so on.
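The higher cumulants of the Gram-Charlier population may be verified by exact arithmetic. The sketch below assumes the standardized form φ(x){1 + κ₃H₃(x)/3! + κ₄H₄(x)/4!}, obtains its moments from those of the normal curve, and converts moments to cumulants by the usual recurrence. The numerical values of κ₃ and κ₄ are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import comb


def normal_moment(r):
    """E[x^r] for the unit normal curve: (r-1)!! for even r, zero for odd r."""
    if r % 2:
        return Fraction(0)
    out = Fraction(1)
    for k in range(r - 1, 0, -2):
        out *= k
    return out


def gc_moment(r, k3, k4):
    """r-th moment about zero of phi(x) * (1 + k3*H3/6 + k4*H4/24)."""
    m = normal_moment
    h3 = m(r + 3) - 3 * m(r + 1)             # E[x^r H3(x)], H3 = x^3 - 3x
    h4 = m(r + 4) - 6 * m(r + 2) + 3 * m(r)  # E[x^r H4(x)], H4 = x^4 - 6x^2 + 3
    return m(r) + k3 * h3 / 6 + k4 * h4 / 24


def cumulants(moments):
    """Convert moments about zero (moments[0] = 1) to cumulants kappa_1, ..."""
    kappa = [None]
    for n in range(1, len(moments)):
        acc = moments[n]
        for j in range(1, n):
            acc -= comb(n - 1, j - 1) * kappa[j] * moments[n - j]
        kappa.append(acc)
    return kappa[1:]


k3, k4 = Fraction(1, 2), Fraction(1, 5)  # arbitrary illustrative values
kap = cumulants([gc_moment(r, k3, k4) for r in range(9)])
print(kap)
```

The fifth to eighth entries returned are 0, −10κ₃², −35κ₃κ₄ and −35κ₄² for the chosen values.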


If we put κ₄ = 0 in this population, we may eliminate any effect due to kurtosis, and similarly
if we give κ₄ a non-zero value and let κ₃ = 0 we can study the effect of kurtosis when there is
no asymmetry. The moments of the criterion v/V may be written in simple forms for these
two cases. Writing the moments of v/V for the normal population, that is, when κ₃ = κ₄ = 0,
as μ′₁(v/V | N), μ₂(v/V | N), ..., and putting


\[ \beta_1 = \frac{\kappa_3^2}{\kappa_2^3}, \qquad \beta_2 - 3 = \frac{\kappa_4}{\kappa_2^2} = \gamma_2, \]


we have the following expressions: 





Moments of v/V when κ₄ = 0 in the Gram-Charlier population:
(*\_ fly l §.=iF g 1 5V2 
ni(F) r. mf N)+A (a= 1p? 8n? -Za)+? vl -5; on * Bn(w—1) 2n2)” 
iy ei.--3 5 l y2 l a} 
m7) =o) Alana Ip? an? :)- Vvb(5+ 2n(n— ht n2 |’ 
1 9V?) 4 
N)+A(-3 an? * 2(n—1)?* oe WB sa tow)? 
v 3V2) , = 
nF) -_— (7 N)+A(5r ry? 7 ee 1)’ sr): 
Moments of v/V when κ₃ = 0 in the Gram-Charlier population:
(?) _ (2h St. 
vi( 5) 1 “ly ' Jere ~ nt 32n(n — Th 8n2 | 128n? % 
(2) and? ia ; Te Fe 
a = m7 NY) TY an 8nin—1)° Qn? |” 32n2”” 
S). ot* N)+ ee. Sd 
7) HAY |S Y21 anin—1) nm? _|~ 16n?’® 
Cay v) on ee 
ra 7) = mp he +Y2 4n(n—1)  2n? + T6n2 


A third alternative would be to let both x, and x, have given values, and to study the effect 
of combined asymmetry and kurtosis in the parent population. 


























16. The moments of v/V for some parent population which is different from normal
having been obtained, the next step is a matter for choice. One may either, from a study of
the β₁ and β₂ of the distribution of v/V, fit the appropriate Pearson curve using the first four
moments, or one may choose for v/V a distribution having the same functional form as the
parent population. In either case the procedure enables an estimate of the probability
integral to be made.

17. Pearl (1905) gives the momental constants for the distribution of brain-weights in
a number of races. We take the first line of his table and note that 413 Swedish males had
brains of average weight 1400.481 g. The coefficient of variation was 0.07592, and the β₁
and β₂ of the distribution 0.0287 and 2.7964 respectively. We assume that these momental
constants are those of the true distribution of Swedish male brain-weights. We then ask the
following question: if a sample of 51 observations is available and known to come from the
population with V = 0.07592, what difference does it make to the expected mean and
standard error of v/V if we assume the distribution of the population to be normal, instead
of, with β₁ and β₂ as given? Substituting in the general formula we have


\[ \mu'_1(v/V \mid N) = 0.9951, \qquad \mu_2(v/V \mid N) = 0.0101, \qquad \sigma_N = 0.100; \]
\[ \mu'_1(v/V \mid \beta_1, \beta_2) = 0.9955, \qquad \mu_2(v/V \mid \beta_1, \beta_2) = 0.0089, \qquad \sigma_{\beta_1 \beta_2} = 0.094. \]

For this case, therefore, the mean and standard deviation of v/V appear to be little
changed from those which would have been obtained if normality had been assumed. It
appears likely that in the case of anthropometric data, where the deviations from normality are
never very marked and the sample size is usually fairly large, the expressions for the mean
and standard deviation of v/V assuming the parent population is normal will be adequate.
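The normal-theory figures for n = 51 and V = 0.07592 may be checked independently of the k-statistic expansions. The sketch below is an assumption-laden alternative route, not the formula of this paper: for a normal parent s and x̄ are independent, so that ℰ(v/V) is the product of ℰ(s/σ), taken exactly, and ℰ(μ/x̄), taken from a short series in τ = V²/n; the second moment is treated likewise. The function names are illustrative.

```python
import math


def mean_s_over_sigma(n):
    """Exact E(s/sigma) for a normal sample of size n (divisor n - 1)."""
    nu = n - 1
    return math.sqrt(2.0 / nu) * math.exp(math.lgamma(n / 2) - math.lgamma(nu / 2))


def normal_cv_ratio_moments(n, V):
    """Approximate mean and variance of v/V for a normal parent.

    Assumes xbar = mu * (1 + e) with e normal, variance tau = V^2/n, and
    expands 1/(1 + e) and 1/(1 + e)^2 as far as the term in tau^2.
    """
    tau = V * V / n
    inv1 = 1 + tau + 3 * tau ** 2        # series for E(mu/xbar)
    inv2 = 1 + 3 * tau + 15 * tau ** 2   # series for E(mu^2/xbar^2)
    mean = mean_s_over_sigma(n) * inv1
    var = inv2 - mean ** 2               # uses E(s^2/sigma^2) = 1
    return mean, var


mean, var = normal_cv_ratio_moments(51, 0.07592)
print(round(mean, 4), round(var, 4), round(math.sqrt(var), 3))
```

On this reckoning the mean, variance and standard deviation round to 0.9951, 0.0101 and 0.100.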


I should like to thank Professor E. S. Pearson and Professor M. G. Kendall for helpful
criticism.




K(r* 8!) 


x(1*) 

x(2 1) 
«(2? 1?) 
«(2° 1) 
K(24) 

«(3 13) 
«(3? 1?) 
«(38 1) 


k(3*) 


K(4 15) 
x(4? 12) 
x(4° 1) 
x(3 2°) 


«(3? 2?) 


k(3* 2) 


k(r* 8") 
«(15) 


x(2 14) 
x(2? 13) 
x(2? 12) 
x(2*1) 


«(2°) 





Note on the application of Fisher's k-statistics 
Table 1. Terms to be added to x(r's*) to give (r*s!) 








h+l=4 

Add. 
3x3 
ne 
3K_K3 

n? 
Kaka 2K8 2K 

ne n® n(n—1) 
SkgK_  Gkaxt 

n® n(n — 1) 
3Ki | 12K, KS 12x$ 
n?  n(n—1)° (nm—1)* 
3K 4K 

n? 
Keke  2xg = 9x, K3 9x2 K, 6x3 





n? n® n(n—1) n(m—1)  (n—1)(n—2) 
Bkgky , 2TKGK, | 27KyKS 18K, K3 
n® n(n—1)° n(n—1)  (n—1)(n—2) 


























BKE 54K KyKy 54K eK | 36K gk} 486K, K?K, 324K, Kin 324x2xin 

nm n(n—1)  n(n—1) (n—1)(n—2) (n—1)®? — (n—1)?(n—2)  (n—1)?(n—2) 
243x2xK2 = .2.43x$ 108x$n? 
(n—ij® * (m—1)** (m—1)*(m—2)° 

3K Ko 

ne 

KeKk—  1Gxaki 2x3 48x,Kgk,  S4xix, 72K, Ky 1443 x2 « i8(n +1) 

“n? tn(n—1)’ nm? * n(n—1) * n(n—1) * (n—1)(n—2) * (W—1)(n—2) * (n—1; (n—2)(n—3) 

Bkgks 48K GK 5 Ks 144x2x, 102x,x? 216x,x,x? 432x,K2K, 72Kx,Ki(n +1) 





n® n(n—1) *n(n—1) n(n—1)* (m—1)(n—2) * (H—1) (n—2) * (W—1)(n—2) (n—3) 
Sxgk,  Gigxd  18Kkgkgk, 36 
Rw? “n(n—1)° n(n—1) © (n—1)? 








3 
K3kq 














Keak, 2k gKy | 2KE  24K,KyKy  OKIK, 9KyKS 18K, KE 
n? "n(m—1)° wn? * n(n—1) © n(n—1)° n(n—1)* (n—1)? 
6x,K3 4 20K3KS 12kin | 
(n—1}(n—2)  (m—1)?  (n—1)? (n—2) 
3k gk;  18kgk3Ke 27K KaKs 162K,K,K% 27K,K? 18x,K«3 . 162k3K, 108nx, x! 
n® n(n—1) n(n—1) (n—1)® “n(n—1)  (n—1)(n—2) (nm 1)? (n—1)?(n--2) 
Add. nies 
10K3K, 
—-? 


BkgKg  TKgKs  14kyk} 


























n® n§ nin — 1) 
Keke  Okgk, , 3x3 18K,K2 = (28n— 32) K2x, 8 - 
ne" on * on? * n(n—1) nin—1? ‘nm—1p 
4kgK,  6KsKy  L2KgK2 "Wk ykgkK, 16(n—2)K2 80K? 
“~~ n3 n*{n—1)° n{n—1) n*(n— 1)? tan—1)8 


lOxgk, , BWxgxt  "B0xkx,  320x,x2 40(n—2) x34 80m = 2) oa, _ OeKe 
n& * n(n—1) n{n—1) * n(n—1)? * n%(n—1)2 8°" n(n—1 2? (n— 1) 














n2 


=> 


«(3 14) 
«(3? 15) 


«(3* 1?) 


K(rs h') 


x(18) 
«(2 15) 
«(2? 14) 


«(25 15) 


F. N. Davin 393 


Table 1. (cont.) 























h+l=5 
6K5Kq , 4K3K 
Sy, Ss 
Sik Kaks , OksKy 27x,x? 90K Kako 9x3 s 60x,«3 
ns ns n® “n%{n—1) n&{n—1) n{n—1) n(m—1)(n—2) 
6k7Ky  KgKs 27 2 , 3KsKg , 27(3N—4) KgK3Kq 81 27(4n —7) 
ns n3 tan —1)*8t n3 n*(n— 1)? oo" n¥1—1)° n%{n—1)? 
27 a 18 54(4n—7) ) 162x2xK, 
ta{n—1) Kad + xed 1) (n—2) * n(n—1)*(n—2)) * nXn—1) 


165(5n — 12) 108 36(7n? — 30n + 34) 7m 108(5n — 12) -_ 
n(n — 1)? (n—2) qanhece)* n(n —1)?(n—2)? * * * (n—1)?(n—2)* * * 








+m Ky3( 


h+l=6. Corrective terms involving 4 only 
Add. 
15x3 


ni 





15K,%3 

oe 

12xtx, 3x, xs 4. 6x 
n*(n — 1) 

Ox Kak Gxt 18x,xt 
nm | on n?(n — 1) 


12kyk3 | 24KiKy  BKGK,  L2KyKy | 12K§ 





ns n> 














n n*(n — 1) n — n®n—1)  nin—1)? 
15x2x, GOK,K,K2 60K, x} 
n§ n*n—1)° n(n—1)? 


15x32 = 9Ox2x? =: 180K,xK$ 120«§ 
ns % nn—1)° n(m—1)?  (n—1)8 














15x, «? 
n® 
SKEKe | 12KGKs 27K, Ks . 27K2K? x 18x 
ns ns nn—1) n{n—-1) n(nm—1)(m—2) 


REFERENCES 


FISHER, R. A. (1928). Proc. Lond. Math. Soc. Ser. 2, 30, 199.
KENDALL, M. G. (1940a). Ann. Eugen., Lond., 10, 106.
KENDALL, M. G. (1940b). Ann. Eugen., Lond., 10, 215.
KENDALL, M. G. (1940c). Ann. Eugen., Lond., 10, 392.
KENDALL, M. G. (1942). Ann. Eugen., Lond., 11, 300.
KENDALL, M. G. (1943). The Advanced Theory of Statistics. I. Chas. Griffin and Co.
MCKAY, A. T. (1932). J. R. Statist. Soc. 96, 695.
PEARL, R. (1905). Biometrika, 4, 38.
WISHART, J. (1929a). Proc. Lond. Math. Soc. 29, 309.
WISHART, J. (1929b). Proc. Roy. Soc. Edinb. 49, 78.
WISHART, J. (1930). Biometrika, 22, 224.
WISHART, J. (1933). Biometrika, 25, 52.






























[ 394 ] 


THE MOMENTS OF THE z AND F DISTRIBUTIONS 
By F. N. DAVID 


1. The cumulants of Fisher's z distribution were derived approximately by Cornish &
Fisher (1937) and exactly by J. Wishart (1947a) some years later. In both cases, however,
the assumption is that the parent population (or populations) generating the samples is
normal. It appears simple, using Fisher's k-statistic technique, to derive approximations
to the cumulants of z and F when the two estimates of variance involved are based on
independent samplings from two parent populations which may have any distributions
whatever provided the cumulants exist. These approximations to the true cumulants can
be used to investigate the effect of non-normality in the parent population on the
distributions of z and F, or to obtain approximations to the power of the z and F tests with respect
to a set of specifically defined alternative hypotheses. It should be noted that in the type of
problem we have in mind, the two estimates of variance are essentially independent or at
least uncorrelated. In many problems in the analysis of variance non-normality in the
parent population introduces a correlation between the estimates which are independent in
the normal case (see E. S. Pearson (1931) and R. C. Geary (1947)). We are not concerned here
with this latter problem.


2. It will be assumed that there are two populations π₁ and π₂, each of which may be
described by cumulants up to any order desired. For the cumulants of π₁ we write κ₁, κ₂, ...,
and for those of π₂, κ′₁, κ′₂, .... Following MacMahon we define the one-part symmetric
functions as
\[ s_r = \sum_{j=1}^{n} x_j^r, \]
where n is the number of magnitudes involved. Hence if we imagine samples of sizes n₁ and
n₂ randomly and independently drawn from π₁ and π₂ respectively, and if we associate with
each element of the sample a random variable x₁, x₂, ..., x_{n₁}; x′₁, x′₂, ..., x′_{n₂}, we shall have
\[ z = \tfrac{1}{2}\log_e\frac{k_2}{k'_2}, \]
where
\[ k_2 = \frac{n_1}{n_1-1}\left(\frac{s_2}{n_1} - \frac{s_1^2}{n_1^2}\right) \]
for π₁, and a similar interpretation may be given for k′₂ and π₂.


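3. We begin by expanding z:
\[ 2z - \log_e\frac{\kappa_2}{\kappa'_2} = \log_e\left(1 + \frac{k_2-\kappa_2}{\kappa_2}\right) - \log_e\left(1 + \frac{k'_2-\kappa'_2}{\kappa'_2}\right). \]
It is clear that expansion of the right-hand side will only be valid provided
\[ k_2 < 2\kappa_2 \quad\text{and}\quad k'_2 < 2\kappa'_2. \]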


The general question of the validity of expansions of this type has been attempted only 
by J. B. D. Derksen (1939), but it appears possible to justify the use of such expansions 







for reasonably large n in an approximate sort of way. We argue for the expansion of
logₑ(1 + (k₂ − κ₂)/κ₂) only, but it is obvious that the same kind of reasoning may also be
applied to logₑ(1 + (k′₂ − κ′₂)/κ′₂). k₂ is distributed with mean κ₂ and variance σ²(k₂), where
\[ \sigma_{k_2}^2 = \frac{\kappa_4}{n_1} + \frac{2\kappa_2^2}{n_1-1}, \]
and, in fact, n₁ may be so chosen that for some fixed positive integer r,
\[ r\left(\frac{\kappa_4}{n_1} + \frac{2\kappa_2^2}{n_1-1}\right)^{1/2} < \kappa_2. \]
Provided r > 3 it is clear that for reasonable-sized n₁ we shall have
\[ P\{k_2 > \kappa_2 + r\sigma_{k_2}\} < \epsilon, \]
where ε is some small positive fraction. Hence it appears, for n₁ of reasonable size, the
expansion will be valid, except in a small proportion of cases, for most distributions met with
in statistical practice. An alternative approach would be to regard the distribution of k₂ as
truncated at the point k₂ = 2κ₂, the moments derived being applicable to this truncated
distribution.


’ 


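4. If we write
\[ z - \tfrac{1}{2}\log_e\frac{\kappa_2}{\kappa'_2} = \tfrac{1}{2}(a - a'), \]
where
\[ a = \log_e\left(1 + \frac{k_2-\kappa_2}{\kappa_2}\right) \quad\text{and}\quad a' = \log_e\left(1 + \frac{k'_2-\kappa'_2}{\kappa'_2}\right), \]
then since a and a′ are independent we shall have
\[ \mathcal{E}(z) = \tfrac{1}{2}\log_e\frac{\kappa_2}{\kappa'_2} + \tfrac{1}{2}\mathcal{E}(a) - \tfrac{1}{2}\mathcal{E}(a'), \]
\[ \kappa_2(z) = \sigma_z^2 = \tfrac{1}{4}\left(\kappa_2(a) + \kappa_2(a')\right), \]
\[ \kappa_3(z) = \tfrac{1}{8}\left(\kappa_3(a) - \kappa_3(a')\right), \]
\[ \kappa_4(z) = \tfrac{1}{16}\left(\kappa_4(a) + \kappa_4(a')\right). \]
Further, since a and a′ have the same functional form, we may derive the moments of a and
obtain the moments of a′ by adding the appropriate dashes. We consider then
\[ a = \log_e\left(1 + \frac{k_2-\kappa_2}{\kappa_2}\right) = \frac{k_2-\kappa_2}{\kappa_2} - \frac{1}{2}\frac{(k_2-\kappa_2)^2}{\kappa_2^2} + \frac{1}{3}\frac{(k_2-\kappa_2)^3}{\kappa_2^3} - \ldots \]
and take expectations. Using the notation
\[ \mathcal{E}(k_2-\kappa_2)^r = \mu(2^r), \]
it is immediate that
\[ \mathcal{E}(a) = -\frac{1}{2}\frac{\mu(2^2)}{\kappa_2^2} + \frac{1}{3}\frac{\mu(2^3)}{\kappa_2^3} - \frac{1}{4}\frac{\mu(2^4)}{\kappa_2^4} + \ldots \]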










From Fisher’s or Kendall’s tables of product cumulants and the tables of corrective factors 
recently given (David, 1949), we may write down immediately the right-hand side of the above 
expression, neglecting terms of higher order than 1/n?: 


Ee 1 1 Ky(mi—4n, +7) «3 (4(m,—2)(m,—7))  Kef m4 —7 
ty ge ee a ee) ae 


Sense) “33 vaso] Kg . Kk 8(n,—2) 


KA 4ni(n,—1)8 kh \ni(m— 12) gant n(n — 1)? 

















We introduce the notation
\[ \gamma_r = \frac{\kappa_{r+2}}{\kappa_2^{(r+2)/2}}, \qquad f_1 = n_1 - 1, \]
and write the expansion of ℰ(a) in terms of the γ's and f₁, up to order 1/f₁³. Thus


1 1 
E(x) = of, (2+ 12) + Top 4+ 18y,+ 16y}+ 474-973) 


1 
+ ia! — 42y, — 128y} — 32y, + 30y3 — 96 yyy, + 96a} — 376+ 247472— 3073]. 


The expression for ℰ(a′) will be similar except that f will carry the subscript 2 and the γ's
will carry dashes. If we write
Yr |! Ye _V 
"le fi fi 


we shall have 


K ] : SS &. : 
(2) = plogt—[ F247) | +34] al—4+ 181+ LOyt + ay. OM0 | 


1] 1 " 
+34 E (—42y,— 128y}— 32y, + 30y3 — 96y,7, + 96ye7i — 376+ 24y7,72— 2079) | 


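5. The higher moments of z follow in a similar way. We use the expansion
\[ (\log_e(1+x))^r = A_{r,0}\,x^r - A_{r,1}\frac{x^{r+1}}{r+1} + A_{r,2}\frac{x^{r+2}}{(r+1)(r+2)} - A_{r,3}\frac{x^{r+3}}{(r+1)(r+2)(r+3)} + \ldots, \]
where the coefficients A_{r,s} are given by Table 1.

Table 1. Values of A_{r,s}

s | r = 1 | r = 2 | r = 3 | r = 4 | r = 5 | r = 6
0 | 1 | 1 | 1 | 1 | 1 | 1
1 | 1 | 3 | 6 | 10 | 15 |
2 | 2 | 11 | 35 | 85 | |
3 | 6 | 50 | 225 | | |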
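The tabulated coefficients can be generated mechanically. The sketch below assumes that A_{r,s} is the unsigned Stirling number of the first kind c(r + s, r), an identification suggested by the tabulated values but not stated in the text, and checks a truncated series for (logₑ(1 + x))² against direct evaluation. The function names are illustrative.

```python
import math


def unsigned_stirling_first(n, k):
    """c(n, k) from the recurrence c(i, j) = (i-1) c(i-1, j) + c(i-1, j-1)."""
    table = [[0] * (k + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, min(i, k) + 1):
            table[i][j] = (i - 1) * table[i - 1][j] + table[i - 1][j - 1]
    return table[n][k]


def A(r, s):
    """Assumed identification of the tabulated coefficient A_{r,s}."""
    return unsigned_stirling_first(r + s, r)


def log_power_series(x, r, terms):
    """Sum of the first `terms` terms of the expansion of (log(1+x))^r."""
    total = 0.0
    denom = 1.0
    for s in range(terms):
        if s > 0:
            denom *= r + s  # builds (r+1)(r+2)...(r+s)
        total += (-1) ** s * A(r, s) * x ** (r + s) / denom
    return total


for s in range(4):
    print(s, [A(r, s) for r in range(1, 7 - s)])
print(log_power_series(0.1, 2, 4), math.log(1.1) ** 2)
```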








For &(a?) we have 


52 93 ae 
(a?) == 3 (2) a®) +pa®™ 5g I+ 
whence on substitution 


2 3 2 K,(3n?—9n, +13) Kg (3m,—19) «§4(n, — 2) (3n,— 19) 
2 4 1 oe | i 1 1 
€(a") = 1 ta, —18* 1) a2 Sn,(m,-1)®? _33n8(n,—-1) 2 3n,(n,—1)° 
pring 125n}+63n,+117) Kg Il  Kek, 25 25 | K5K3 88(n,—2) 
+A 12n2(n, — 13 Kil2n3kS 3n3 x4 3n3(n,— 1)? 
Kk} 100(n,—2) 137 wR 
Kg 3n2(n,—1)®  L2n3x$ 




















We may expand as before in terms of the y’s and powers of f,, from this write down &(a’*) 
immediately, correct each expansion for the origin not being at the mean, and finally obtain 


If . - Se . 
ea eh a er 
tal! (16 + 84y, + 3847? + 96y, — 92y2 + 352y,y, — 384y_ 73 + Lly,— 96yy724+ 12873 |. 


Further expansion and algebraic reduction gives 


K3 (z) -3|- 4+ 4yi+¥4- | 


i 
+75 al — 16 —64y? — 16y, + 12y3— 96y,y, + 120y,y? —3y,4+ 30772-4470) | 


K(2) = 75 lz (16 + By} + 32y371 — 487271 + Ye— 127472 + 20 | " 


It is clear that to obtain an accurate result for x,(z) it will be necessary to take further terms 
in the original expansion in order to be able to obtain the result to order 1/f*. 


6. The arithmetic involved in the expansions is heavy, and it does not seem possible to 
obtain a general check. We may note that if the two samples are assumed to have been drawn 
from different normal populations, then 

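\[ \mathcal{E}(z) - \tfrac{1}{2}\log_e\frac{\kappa_2}{\kappa'_2} = -\frac{1}{2}\left(\frac{1}{f_1} - \frac{1}{f_2}\right) - \frac{1}{6}\left(\frac{1}{f_1^2} - \frac{1}{f_2^2}\right), \]
\[ \kappa_2(z) = \frac{1}{2}\left(\frac{1}{f_1} + \frac{1}{f_2}\right) + \frac{1}{2}\left(\frac{1}{f_1^2} + \frac{1}{f_2^2}\right) + \frac{1}{3}\left(\frac{1}{f_1^3} + \frac{1}{f_2^3}\right), \]
\[ \kappa_3(z) = -\frac{1}{2}\left(\frac{1}{f_1^2} - \frac{1}{f_2^2}\right) - \left(\frac{1}{f_1^3} - \frac{1}{f_2^3}\right), \]
\[ \kappa_4(z) = \frac{1}{f_1^3} + \frac{1}{f_2^3}. \]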
2 





When κ₂ = κ′₂, or when the samples may be supposed to have been drawn from the same
normal population, the results agree with those given by Cornish & Fisher* (1938). We note


* The Cornish-Fisher results can easily be calculated to any order in 1/f that is desired. See Wishart
(1947a, p. 174).










the interesting property of the z-distribution, that all cumulants except the first are unaltered
by a change in the ratio of the variances of the two normal parent populations, a fact which
may be used to determine the power of the variance-ratio test against a set of specifically
defined alternative hypotheses. As a further small check we may take only terms of order 1/n
in σ_z², whence
\[ \sigma_z^2 = \frac{2+\gamma_2}{4}\left(\frac{1}{n_1} + \frac{1}{n_2}\right), \]
a result derived by Geary (1947).


7. The exact values of the cumulants for the case when the parent population is normal
have been given by Wishart (1947a). Further to this he showed that a closer approximation
to the true cumulants (after the first) was obtained by expanding in powers of 1/(f − 1) rather
than in powers of 1/f. For numerical work we shall therefore rewrite the cumulants in powers
of 1/(f − 1). Let
\[ r = f - 1 = n - 2. \]
Then 


K l t 1 ; 
5(z)—} loge. me [err | + Ee + 24+ 16yi+ ty) | 


1 i 
. Ee (—4—84y, — 1607} — 40, + 483 — 9637, — 374 + 967 QV] + 2474 2 — 30) | 2 
171 1 1 l r x : 1 
K,(z) = q 7 (2+ Ye) ys 7a(— 8¥2—2%- 8yi + 5y3) 


+2 
[7 ‘ 
+ 7g | <a (8+ 168, + 1207, + 480y}— 152y3+ 352757, 
1 
+ Lly_— 967472 — 384y_y3+ 12873) | ' 
+2 
itl : 7? 
Kal2) =] a(— 44 47it%e— 37a] 
l ay 
+ ToL (— 8078-207, + 2478 — 907574 376+ 120774 + 3071744479 | oy 


LT 1 
K,(2) = l= (16 + 8y3 + 32y57, — 4872} + Ye— 12¥472+ 207) | + 
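For cumulants of z for samples from a normal population we have, to order 1/r³,
\[ \mathcal{E}(z) - \tfrac{1}{2}\log_e\frac{\kappa_2}{\kappa'_2} = -\frac{1}{2}\left(\frac{1}{r_1} - \frac{1}{r_2}\right) + \frac{1}{3}\left(\frac{1}{r_1^2} - \frac{1}{r_2^2}\right) - \frac{1}{6}\left(\frac{1}{r_1^3} - \frac{1}{r_2^3}\right), \]
\[ \kappa_2(z) = \frac{1}{2}\left(\frac{1}{r_1} + \frac{1}{r_2}\right) - \frac{1}{6}\left(\frac{1}{r_1^3} + \frac{1}{r_2^3}\right), \]
\[ \kappa_3(z) = -\frac{1}{2}\left(\frac{1}{r_1^2} - \frac{1}{r_2^2}\right), \]
\[ \kappa_4(z) = \frac{1}{r_1^3} + \frac{1}{r_2^3}. \]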

Table 2 gives a comparison of the true values, from Wishart, with those obtained by
substituting in the formulae for normal parents immediately above. The case considered is
f₁ = 24, f₂ = 60. The agreement is satisfactory.








Table 2. Cumulant constants of the distribution of z (f₁ = 24, f₂ = 60; parent population normal)

 | κ₁ | κ₂ | κ₃ | κ₄
Wishart's exact values | −0.0127429 | 0.0301992 | −0.0007998 | 0.0000867
Approximate values | −0.0127431 | 0.0301992 | −0.0008015 | 0.0000871
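The first two lines of this comparison can be reproduced numerically. The sketch below is an illustration under stated assumptions. The exact cumulants are taken from the known cumulants of the logarithm of a χ² variate, κ₁(z) = ½[ψ(f₁/2) − log(f₁/2) − ψ(f₂/2) + log(f₂/2)] and κ₂(z) = ¼[ψ′(f₁/2) + ψ′(f₂/2)]; the digamma and trigamma functions are evaluated from their standard asymptotic series; and the approximate values use the normal-parent expansions in powers of 1/r with r = f − 1, read as −½(1/r₁ − 1/r₂) + ⅓(1/r₁² − 1/r₂²) − ⅙(1/r₁³ − 1/r₂³) for the mean and ½(1/r₁ + 1/r₂) − ⅙(1/r₁³ + 1/r₂³) for the variance.

```python
import math


def digamma(x):
    """psi(x) by upward recurrence and the standard asymptotic series."""
    acc = 0.0
    while x < 10:  # psi(x) = psi(x + 1) - 1/x
        acc -= 1.0 / x
        x += 1.0
    x2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - x2 * (1 / 12 - x2 * (1 / 120 - x2 / 252))


def trigamma(x):
    """psi'(x) by upward recurrence and the standard asymptotic series."""
    acc = 0.0
    while x < 10:  # psi'(x) = psi'(x + 1) + 1/x^2
        acc += 1.0 / (x * x)
        x += 1.0
    x2 = 1.0 / (x * x)
    return acc + 1.0 / x + 0.5 * x2 + (1 / 6 - x2 * (1 / 30 - x2 / 42)) * x2 / x


def z_cumulants_exact(f1, f2):
    """First two cumulants of z for normal parents with equal variances."""
    k1 = 0.5 * ((digamma(f1 / 2) - math.log(f1 / 2))
                - (digamma(f2 / 2) - math.log(f2 / 2)))
    k2 = 0.25 * (trigamma(f1 / 2) + trigamma(f2 / 2))
    return k1, k2


def z_cumulants_series(f1, f2):
    """First two cumulants from the assumed expansions in powers of 1/(f - 1)."""
    r1, r2 = f1 - 1, f2 - 1
    k1 = (-0.5 * (1 / r1 - 1 / r2) + (1 / r1 ** 2 - 1 / r2 ** 2) / 3
          - (1 / r1 ** 3 - 1 / r2 ** 3) / 6)
    k2 = 0.5 * (1 / r1 + 1 / r2) - (1 / r1 ** 3 + 1 / r2 ** 3) / 6
    return k1, k2


exact = z_cumulants_exact(24, 60)
series = z_cumulants_series(24, 60)
print([round(v, 7) for v in exact])
print([round(v, 7) for v in series])
```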




















8. Geary, in his 1947 paper, has discussed the effect of kurtosis on the distribution of z
when both samples are drawn from the same population. As an illustration of the cumulants
of z derived here we shall discuss the effect of skewness in the parent population for the case
f₁ = 24, f₂ = 60. It will be assumed that the parent population may be graduated by the first
three terms of the Gram-Charlier Type A series,* i.e. that

1 ™ Y. nk ) 
ee a 2 J : 
where H₃(X) and H₄(X) are the third and fourth Hermite polynomials. This parent population
has cumulants
\[ \kappa_1,\ \kappa_2,\ \kappa_3,\ \kappa_4;\quad \kappa_5 = 0,\quad \kappa_6 = -10\kappa_3^2,\quad \kappa_7 = -35\kappa_3\kappa_4,\quad \kappa_8 = -35\kappa_4^2. \]
In order to eliminate as far as possible any effect of kurtosis we shall put κ₄ = 0, when the
only higher cumulant which has a non-zero value will be κ₆ = −10κ₃². Under this assumption
and assuming further that κ₂ = 1 the momental constants of the distribution of z for different
degrees of skewness are as given in Table 3. κ₄(z | κ₃) is unaltered at the degree of approximation
to which we are working. It would appear from a study of these moments that the effect of
skewness on the distribution of z is likely to be small, and this is, in fact, the case. There are


Table 3. Momental constants of z (f₁ = 24, f₂ = 60) when κ₄ of parent population is zero

κ₃² | 0.0 | 0.1 | 0.3 | 0.5
μ′₁(z | κ₃) | −0.012743 | −0.012826 | −0.012992 | −0.013158
μ₂(z | κ₃) | 0.030199 | 0.030395 | 0.030787 | 0.031179
σ(z | κ₃) | 0.173779 | 0.174342 | 0.175462 | 0.176576
κ₃(z | κ₃) | −0.000801 | −0.000864 | −0.000988 | −0.001113














various ways in which we can estimate the effect of this skewness. We could use the moments
of z | κ₃ to find the Pearson curve with the appropriate β's, or follow the Fisher-Cornish
procedure and use Edgeworth's series or fit a Gram-Charlier Type A of the functional form given
above. This last procedure is approximate but will be adequate for our purposes. Thus if


* I owe to Dr J. Wishart the suggestion that a more appropriate form for the population would be 


ae H,(X) H,(X) 10yiH,(X)\ 
~3x jee ee 
’ (1+ 31 tay 6: 








y=; 
/(27) 


I agree that for any systematic investigation of the effect of skewness in the parent population on the 
distribution of z it might be better to take this functional form. In the case above, however, I am 
specifying a particular population and my purpose is one of illustration only. 

_ Biometrika 36 26 








z₀.₀₅ is the deviate beyond which 5 % of the frequency might be expected to lie when κ₃ of
the parent population is zero, it is seen that we require to evaluate


1 . V1(2|K3) [° ae 
wet 4x Yil2|Xs 5 har 
(27) sees cies)” ax+ 3! aoa eeeten) Oe) TiS . 
V [xl elxs>] Vv [xslz|xs)) 
Yalz|Ks) [° 1 4x 
vee sosa—rsin) A) Tega) ° an 
Vv [xa(3|x3)) 


remembering that 


ro) x 


9. As a check on the adequacy of the Gram-Charlier Type A we first refer to the tabled
values of z, finding that
\[ z_{0.05} = 0.26535 \quad\text{for}\quad \kappa_3 = 0,\ f_1 = 24,\ f_2 = 60, \]
and then proceed to find the tail area cut off by z₀.₀₅ for these values using the expansion just
above. We find
\[ \Phi(\kappa_3 = 0) = 0.05477 - 0.00439 - 0.00031 = 0.05007. \]
Thus the representation of the z distribution by the normal curve plus the first two corrective
terms of the Gram-Charlier A gives the probability integral correct to three decimal places,
which will mean that it gives sufficient accuracy for our purpose. We repeat the procedure
for κ₃² = 0.1, 0.3, 0.5 and draw up Table 4. It is clear that a moderate amount of skewness in
the parent population will not affect the z test appreciably for the case considered (f₁ = 24,
f₂ = 60), and this will be true for higher values of f₁ and f₂.


Table 4. Tail areas corresponding to z₀.₀₅ (κ₃ = 0) when κ₃ ≠ 0

κ₃² | 0.0 | 0.1 | 0.3 | 0.5
Φ | 0.050 | 0.050 | 0.051 | 0.051























10. To investigate the effect of skewness in the parent population for degrees of freedom
less than those taken in the preceding paragraph, the procedure is complicated a little by the
fact that the cumulants of z are of order 1/n³ only, and that the Gram-Charlier A does not give
exact results. It is perhaps useful, however, to give such calculations as were carried out. We
consider the case f₁ = 8, f₂ = 16, and take only the normal curve and the third Hermite
polynomial to represent the distribution of z. Carrying through the calculations as before
we may draw up Table 5. For κ₃ = 0, Φ should be equal to 0.050 and not 0.051, the
inaccuracy being introduced partly by the moments and partly by the expansion.
However, the figures are, I think, sufficient to show that for the case considered a small
amount of skewness in the parent population will not affect the z test very much.


Table 5. Tail areas cut off by z₀.₀₅ (= 0.4760), f₁ = 8, f₂ = 16

κ₃² | 0.0 | 0.1 | 0.3 | 0.5


11. The moments of F can be obtained by precisely the same method as was used for 
obtaining the moments of z, although the results are not as satisfactory and the algebraic 
manipulation required is very much heavier. We have 


k3—«$ 
a l 2 ‘) 
he al a 
ke rel 14 “82 a=") 
Ke | 


the dashes being given to the numerator this time to avoid much repetition. Thus 





Ke a 1 5ay_ | as), | poey_ 1 as 
Sar) = [1+ 52 gr tae gait 
and on substitution we have, reducing as before, 


é(F) = S147 +atatn(s—7 R +a)- Antal 


1 10 
rege (a- alt ”i( - A) +1 +17- nit a unt RE |- 
Collecting up terms we may write 


oe 
Ko} 
2: 
i 
HT 
i 


éF)=— 


1 
~ * 2+ 7)+7(4- Ye-Vi-—Yst 373) 


_ 


l 
+ 54 (8+ 5ya— 27+ Oy — 473+ 327271 + Yo— A0V2Vi— lOYaYat 1579 |. 


Similarly by expanding &(F?) and &(F*), correcting to moments about the mean, and 
collecting terms we obtain 


«s?T 1 1 3 
ot. = | (2+ y3)+3-(24+) I 16+ Ty. — Sy? — 2y, + 8y3) + a> (24+ ys) (24+ V2) 
F = ra Y. pa Y2)— fi +R Y2 Yi Ya Y2) pa a Y2) Ye 


4 %2_ 3y2(2+Y2) , (2+72) 
fi fife fifi 


1 
*R (88 + 37y, — 32y} + 38y§ + I6y3y, 4+ 3y,— 152y,yi— 387.724 6972) |, 
1 


7 K , 
n(F) = = | (Yat 4y {2+ 12y3 48) + 


(28 + Sy, — 16yj—4y3 + L5y3) 


+770 2+ 72) (2+ 78) +7 (16+ l2y.—4yi— Yq + 6y§) 
1 


1 ; A 6(2+ 72), , : mea 
a ila am by? — L2y3)+—- aT (Yat 471? + Liy,+8) 
1 2 
mo F , 90,2 — F ‘ 
ah (2 + Y2) (56 + 34y, — 20yj— Sy 2473) 
+ al (256 + 180y, — 208y? — 28y, + 168y2 + 96yyy, + 3y,— 204 yn y? —5lyqyet L16y3 |. 


μ₄(F) is of great length and complexity, and for this and reasons given in the succeeding
paragraphs we have not written it out here.




12. It is clear that except for large values of n₁ and n₂, the expansion gives expressions
which are not very good approximations to the true moments. Moreover, it is not thought
that retaining higher powers of 1/f in the expansion will help very much. The numerical
coefficients are increasing so rapidly that for small f the quasi-asymptotic series begin to
diverge before they have converged to any quantity close to the true value. This is perhaps
most easily seen if we consider the case of the moments of F when the two samples have been
drawn from the same normal population. For this case the true moments are easily found
to be (see, for example, Wishart (1947b)),
\[ \mu'_1(F) = \frac{f_2}{f_2-2}, \qquad \mu_2(F) = \frac{2f_2^2(f_1+f_2-2)}{f_1(f_2-2)^2(f_2-4)}, \qquad \mu_3(F) = \left(\frac{f_2}{f_1}\right)^3 \frac{8f_1(f_1+f_2-2)(2f_1+f_2-2)}{(f_2-2)^3(f_2-4)(f_2-6)}. \]
In the expansion in § 11 we write κ₂ = κ′₂ and put all the other κ's equal to zero. We have then








2 4 8 
AAR 
rw ey Ta 
8 2 1 6 a Pas 
fl F) = “t , - 








RR RARER PR 

These approximate moments may be checked by expanding the true moments as power
series in 1/f₁ and 1/f₂, whence the leading terms are found to agree with the approximate
values just given. Numerical substitution shows, however, that these approximate
expressions do not agree very well with the true values even for f₁ = f₂ = 20.


13. Possibly a better approximation to the moments of F when the samples are from any
two parent populations may be obtained by the following artifice. The moments of F when
the samples are both from the same normal population are given by those terms inside the
brackets which are not multiplied by any γ's. We know (Wishart, 1947b) the exact values
μ′₁(F_N), μ₂(F_N), μ₃(F_N), for which these values are only an approximation. Let us write,
therefore,


7 Ky , « 
6) = 2 | mile + Bat rit 16-3 
2 2 2 


l - 
+ Ri (572 — 2yj + 6y3— 473+ 3273744 Ye— 4 2yvi— 10yyy2+ 1578) | , 








3 o iy , 
op = 2] (Fy +% 2, %a_ Yay | zy, — yt Oy, + 8p) +e (Yi(2+ 72) + Yl2+79)) 
fh fhe fi fi ifs 
oe 3y3(2 +Y2) , (2+ 72) ) (28+ 972 — 16yj- — 4Y3+ 1573) — 56 
F fife AR 
l 
+55 (3772— 32y5 + 3873 + 9637) + 376 — 1527275 — ssrare+ re | 
, ker : 6(2 + Ye) (2+ 73) — 24 
ts(F) = — = (yat 4y374+ 12y3 a 12y,—4 +6 
fts(t") al g( Fy) + AM vi Y2)+ Tile + Fal Ye— 47 — Yat 673) 
a , 1», 6(2+Y0) (yj + 4y{2 + lly} + 8) — 96 
—— (24+ By? + 12y3) + ——__ t 3 
ea i Ria fits 
3(2 + Yo) (56 + 347. — 207} — 5y4 + 2473) — 336 


ff 
+ Fa 180y, — 208y} — 28y, + 168y3 + 96y, 7, + 3y,— 204y,y? —5lyyyet 1 1679) | : 
2 







In these expansions the γ's play the rôle of terms which are correcting the true normal
moments for the departure from normality. Provided γ₁ and γ₂ are less than unity these
expressions should give reasonable approximations to the true moments of F when the
samples are drawn from any two parent populations whatever. It will be noted, however,
that in μ₃(F) the numerical coefficients are rather large, indicating that the approximation
will not be too good. This remark holds good a fortiori for μ₄(F).


I should like to thank Prof. E. S. Pearson and Dr J. Wishart for helpful criticism.


REFERENCES 


CORNISH, E. A. & FISHER, R. A. (1937). Revue de l'Institut International de Statistique, 5, 307.
DAVID, F. N. (1949). Biometrika, 36, 383.
DERKSEN, J. B. D. (1939). Ann. Math. Statist. 10, 380.
GEARY, R. C. (1947). Biometrika, 34, 209.
PEARSON, E. S. (1931). Biometrika, 23, 119.
WISHART, J. (1947a). Biometrika, 34, 170.
WISHART, J. (1947b). J. Inst. Actuar. Stud. Soc. 6, 172.








[ 404 ] 


THE METHOD OF FREQUENCY-MOMENTS AND ITS
APPLICATION TO TYPE VII POPULATIONS


By HERBERT S. SICHEL 


National Institute for Personnel Research, South African Council 
for Scientific and Industrial Research 


Part 1. THEORETICAL 


The method of frequency-moments was developed primarily with the object of overcoming 
certain difficulties when fitting growth curves to observed data. In the author’s original 
paper (1947) examples were given on how to fit exponential, logistic and Gompertz functions. 
In the second part of the same paper it was pointed out that the method could also be applied 
to the graduation of frequency-distributions and examples of normal, lognormal and 
Pearson Type VII curves were presented. 

Yule’s (1938) investigation, based on an early suggestion of Karl Pearson, has come to 
the author’s notice recently. The fitting process described by Yule is essentially the same 
as my method of frequency-moments. The derivation of the standard errors given on the 
following pages is based partly on Yule’s work. 

In this paper we are more interested in the problem of estimating the parameters of a 
given type of frequency-distribution by the method of frequency-moments than in fitting 
the proposed distribution to a set of observations, though this aspect is dealt with in the 
last section. The present investigation has been limited to the case of frequency-distributions 
only, and the section comparing the maximum likelihood solution with the frequency- 
moment method has been confined to a Pearson Type VII population. 


Definition of parameters 


The nth frequency-moment of a population represented by a probability law

$$y\,dx = f(x)\,dx$$

will be defined as

$$J_n = N^n \int_{-\infty}^{+\infty} y^n\,dx, \qquad (1)$$

where N is the total number of items in the sample and is equal to $J_1$. Corresponding to (1)
we shall define the nth probability-moment of the population as

$$\Omega_n = \frac{J_n}{N^n} = \int_{-\infty}^{+\infty} y^n\,dx. \qquad (2)$$


In practice we are rarely in the position of estimating $\Omega_n$ directly. However, it is
comparatively easy to estimate the parameter

$$\omega_n = \left(\int_{-\infty}^{a} y\,dx\right)^n + \sum_{i=1}^{r}\left(\int_{a+h(i-1)}^{a+hi} y\,dx\right)^n + \left(\int_{b}^{+\infty} y\,dx\right)^n = \sum_{i=0}^{r+1}\pi_i^n, \text{ say.} \qquad (3)$$




By suitable choice of a and b we can make

$$\pi_0 = \int_{-\infty}^{a} y\,dx \quad\text{and}\quad \pi_{r+1} = \int_{b}^{+\infty} y\,dx$$

smaller than any preassigned $\epsilon$, so that in practice

$$\omega_n \approx \sum_{i=1}^{r}\left(\int_{a+h(i-1)}^{a+hi} y\,dx\right)^n = \sum_{i=1}^{r}\pi_i^n, \qquad (3a)$$

where r is the number of classes of equal width h = (b - a)/r into which the population has
been subdivided.

In the limit we have

$$\lim_{h\to 0} h^{1-n}\omega_n = \Omega_n.$$

For computational work it is convenient to make h equal to unity. Even for r as small as 15
the approximation

$$h^{1-n}\omega_n \approx \Omega_n$$

is reasonable for most cases provided n is of a low order. It may be improved by the use of
a correction to be described in Part 2. The parameter $\omega_n$ is hereafter called the nth working
probability-moment.
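
The definition of the working probability-moment can be checked numerically. The following is only a sketch: it assumes a normal parent law with standard deviation of two class widths, 15 unit-width classes centred on zero plus the two tails, and it uses the standard closed form $\Omega_n = (2\pi\sigma^2)^{(1-n)/2} n^{-1/2}$ for a normal law. The function names are illustrative and do not come from the paper.

```python
import math


def normal_cdf(x, sigma):
    """Cumulative distribution function of a normal law N(0, sigma^2)."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def class_probabilities(sigma, r=15, h=1.0):
    """Return pi_0, ..., pi_{r+1}: lower tail, r classes of width h, upper tail."""
    a = -0.5 * r * h                      # lower limit of the first class
    edges = [a + i * h for i in range(r + 1)]
    cdf = [normal_cdf(e, sigma) for e in edges]
    inner = [cdf[i + 1] - cdf[i] for i in range(r)]
    return [cdf[0]] + inner + [1.0 - cdf[-1]]


def working_probability_moment(n, sigma, r=15, h=1.0):
    """omega_n = sum over all classes of pi_i ** n, as in equation (3)."""
    return sum(p ** n for p in class_probabilities(sigma, r, h))


def probability_moment(n, sigma):
    """Omega_n = integral of y^n dx for a normal law (closed form)."""
    return (2.0 * math.pi * sigma ** 2) ** ((1.0 - n) / 2.0) / math.sqrt(n)


if __name__ == "__main__":
    sigma = 2.0  # standard deviation measured in class widths (assumption)
    for n in (1.5, 2.0, 3.0):
        print(n, working_probability_moment(n, sigma), probability_moment(n, sigma))
```

With h = 1 the factor $h^{1-n}$ is unity, so the two printed columns can be compared directly; the working moment falls slightly below the probability-moment because of grouping.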

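Certain ratios of working probability-moments are defined as

$$\alpha_n = \frac{\omega_{n+2}}{\omega_2^{\,n+1}} \quad (n = 1, 2, 3, 4, \ldots),$$

and

$$\alpha'_n = \frac{\omega_{(n+3)/2}}{\omega_{3/2}^{\,n+1}} \quad (n = 1, 2, 3, 4, \ldots). \qquad (4)$$

The $\alpha_n$ coefficients may be used as measures of kurtosis.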
The dispersion of a population may be represented by parameters of the nature

$$\rho_n = \omega_n^{1/(1-n)}, \qquad (5)$$

subject to certain limitations mentioned in Yule's (1938) paper.
When scale and location of a distribution are changed we transform the variate by
$z = kx + l$.
The transformed distribution becomes

$$f(z) = \frac{1}{k} f\!\left(\frac{z-l}{k}\right),$$

$$\omega_n(z) = \sum_{i=0}^{r+1}\left[\int \frac{1}{k} f\!\left(\frac{z-l}{k}\right) dz\right]^n = k^{1-n}\omega_n. \qquad (6)$$

It follows that the working probability-moments are independent of the constant l and have
to be multiplied by $k^{1-n}$ if the variate is multiplied by k. They are, therefore, semi-invariant
under the transformation $z = kx + l$.

The $\alpha_n$ coefficients are unaffected by the transformation, as can easily be shown by
substitution of equation (6) into (4).






















In the following, extensive use will be made of working probability-moments. In practical
applications, however, we usually deal with working frequency-moments. In this fact lies
the justification of the name suggested for the new method.


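Large sample mean values, variances and covariances of statistics $o_n$ and $a_n$

We may estimate $\omega_n$ by

$$o_n = \sum_{i=0}^{r+1} p_i^n, \qquad (7)$$

where $p_i = f_i/N$ = observed proportion of observations falling into the ith class interval.
Denoting the deviation of $p_i$ from its mean value $\pi_i$ by

$$\delta p_i = p_i - \pi_i$$

we have for the mean value of $o_n$

$$E(o_n) = \Sigma E(p_i^n) = \Sigma E(\pi_i + \delta p_i)^n = \Sigma E(\pi_i^n + n\pi_i^{n-1}\delta p_i + n_2\pi_i^{n-2}\delta p_i^2 + \ldots),$$

where $n_2$ is written for $\binom{n}{2}$. Neglecting higher powers of $\delta p_i$, as $E(\delta p_i^3)$ will be of order $N^{-2}$,
and using results obtained from the binomial theorem

$$E(\delta p_i) = 0 \quad\text{and}\quad E(\delta p_i^2) = \frac{\pi_i(1-\pi_i)}{N},$$

we find

$$E(o_n) = \Sigma\left[\pi_i^n + \frac{n_2}{N}(\pi_i^{n-1} - \pi_i^n)\right] + O(N^{-2}) = \omega_n + \frac{n_2}{N}(\omega_{n-1} - \omega_n) + O(N^{-2}). \qquad (8)$$

For large N

$$E(o_n) = \omega_n + O(N^{-1}). \qquad (9)$$

Writing $\delta o_n$ for the deviation of $o_n$ from its mean value $E(o_n)$, we have

$$\delta o_n = o_n - E(o_n) = \Sigma p_i^n - \Sigma\pi_i^n + O(N^{-1}) = \Sigma[n\pi_i^{n-1}\delta p_i + n_2\pi_i^{n-2}\delta p_i^2] + O(N^{-1}).$$

After squaring this expression and taking mean values we have

$$\mathrm{var}(o_n) = n^2[\Sigma\pi_i^{2n-2}E(\delta p_i^2) + \Sigma(\pi_i\pi_j)^{n-1}E(\delta p_i\,\delta p_j)] + O(N^{-2}),$$

where $i \neq j$ and all permutations are permitted. On substituting the value for $E(\delta p_i^2)$ given
previously and also the covariance

$$E(\delta p_i\,\delta p_j) = -\frac{\pi_i\pi_j}{N},$$

we obtain

$$\mathrm{var}(o_n) = \frac{n^2}{N}[\Sigma\pi_i^{2n-1} - \Sigma\pi_i^{2n} - \Sigma(\pi_i\pi_j)^n] + O(N^{-2}) = \frac{n^2}{N}[\omega_{2n-1} - \omega_n^2] + O(N^{-2}). \qquad (10)$$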


As an example, let us find the efficiency of the method of frequency-moments in estimating
the standard deviation $\sigma$ of a normal population. For class width h = 1

$$\omega_n \approx \Omega_n = \frac{1}{(\sigma\sqrt{2\pi})^n}\int_{-\infty}^{+\infty} e^{-nx^2/2\sigma^2}\,dx = (2\pi\sigma^2)^{\frac12(1-n)}\, n^{-\frac12}.$$





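$\sigma$, expressed in terms of working probability-moments of order 3/2 (n = 3/2 for reasons to be
explained at a later stage; see equation (53)), is

$$\sigma = \frac{1}{3}\sqrt{\frac{2}{\pi}}\;\omega_{3/2}^{-2}.$$

We can, therefore, estimate $\sigma$ by

$$s' = \frac{1}{3}\sqrt{\frac{2}{\pi}}\;o_{3/2}^{-2}. \qquad (11)$$

Writing (11) as

$$s' = \frac{1}{3}\sqrt{\frac{2}{\pi}}\,[E(o_{3/2}) + \delta o_{3/2}]^{-2},$$

and expanding the bracket for small deviations $\delta o_{3/2}$ such that

$$\left|\frac{\delta o_{3/2}}{E(o_{3/2})}\right| < 1,$$

$$s' = \frac{1}{3}\sqrt{\frac{2}{\pi}}\,[E(o_{3/2})]^{-2}\left[1 - \frac{2\,\delta o_{3/2}}{E(o_{3/2})} + \ldots\right], \qquad E(s') = \frac{1}{3}\sqrt{\frac{2}{\pi}}\;\omega_{3/2}^{-2} + O(N^{-1}) = \sigma + O(N^{-1}). \qquad (12)$$

It follows from (12) that s' is a consistent estimator.
Further, for small $\delta o_{3/2}$,

$$\mathrm{var}(s') = \frac{4\sigma^2}{\omega_{3/2}^2}\,\mathrm{var}(o_{3/2}),$$

and hence, from (10),

$$\mathrm{var}(s') = \frac{9\sigma^2}{N\omega_{3/2}^2}\,[\omega_2 - \omega_{3/2}^2] = \frac{9}{N}\left(\tfrac34\sqrt2 - 1\right)\sigma^2. \qquad (13)$$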


Hence the efficiency of s' is given for large N by

$$\text{Eff.}(s') = \frac{1}{9\left(\dfrac{3}{\sqrt2} - 2\right)} = 0.916, \qquad (14)$$

which is good in comparison with the efficiencies of other estimators sometimes used in place
of the moment-statistics.

The covariances of working probability-moments are derived in a similar way to the
variances. We find

$$\mathrm{cov}(o_n, o_m) = \frac{nm}{N}(\omega_{n+m-1} - \omega_n\omega_m) + O(N^{-2}). \qquad (15)$$
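
The efficiency quoted in (14) may be reproduced from the variance formula (10). The sketch below assumes the normal closed form $\omega_n = (2\pi\sigma^2)^{(1-n)/2} n^{-1/2}$ (class width unity, grouping neglected), a first-order expansion of $\sigma \propto \omega_n^{1/(1-n)}$, and the usual large-sample variance $\sigma^2/2N$ of the maximum likelihood estimator of $\sigma$. The function names are illustrative only.

```python
import math


def omega_normal(n, sigma=1.0):
    """Probability-moment of order n for a normal law (grouping neglected)."""
    return (2.0 * math.pi * sigma ** 2) ** ((1.0 - n) / 2.0) / math.sqrt(n)


def efficiency_of_sigma_estimator(n, sigma=1.0):
    """Large-sample efficiency of the estimator of sigma built from o_n."""
    w_n = omega_normal(n, sigma)
    w_2n1 = omega_normal(2.0 * n - 1.0, sigma)
    # N * var(o_n) from equation (10)
    n_var_o = n ** 2 * (w_2n1 - w_n ** 2)
    # sigma is proportional to omega_n ** (1 / (1 - n)); first-order derivative
    dsigma_domega = sigma / ((1.0 - n) * w_n)
    n_var_s = dsigma_domega ** 2 * n_var_o
    # N * variance of the maximum likelihood estimator of sigma
    n_var_ml = sigma ** 2 / 2.0
    return n_var_ml / n_var_s


if __name__ == "__main__":
    for order in (1.5, 2.0):
        print(order, round(efficiency_of_sigma_estimator(order), 3))
```

The same routine with n = 2 gives the efficiency of the estimator based on the second-order statistic, which corresponds to the limiting normal case of the Type VII comparison made later in the paper.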


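The $\alpha_n$ ratios of the working probability-moments may be estimated by

$$a_n = \frac{o_{n+2}}{o_2^{\,n+1}} \quad (n = 1, 2, 3, 4, \ldots),$$

and

$$a'_n = \frac{o_{(n+3)/2}}{o_{3/2}^{\,n+1}} \quad (n = 1, 2, 3, 4, \ldots). \qquad (16)$$

Now if $a_n = \phi(o_i, o_j)$, we find in the usual way

$$\mathrm{var}(a_n) = \left(\frac{\partial\phi}{\partial o_i}\right)^2 \mathrm{var}(o_i) + 2\,\frac{\partial\phi}{\partial o_i}\frac{\partial\phi}{\partial o_j}\,\mathrm{cov}(o_i, o_j) + \left(\frac{\partial\phi}{\partial o_j}\right)^2 \mathrm{var}(o_j). \qquad (17)$$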








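This result is correct to order $N^{-1}$. For the ratio $a'_1 = o_2/o_{3/2}^2$ we find

$$\mathrm{var}(a'_1) = \frac{1}{N}[4\alpha'_3 + 9\alpha'^{\,3}_1 - \alpha'^{\,2}_1 - 12\alpha'_1\alpha'_2] + O(N^{-2}), \qquad (18)$$

and for the ratio $a_1 = o_3/o_2^2$

$$\mathrm{var}(a_1) = \frac{1}{N}[9\alpha_3 + 16\alpha_1^3 - \alpha_1^2 - 24\alpha_1\alpha_2] + O(N^{-2}). \qquad (19)$$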


First three exact moments of statistic $o_2$

With a view of getting some idea as to the sample size N for which one may expect the
statistics $o_2$ and $o_{3/2}$ to be normally distributed, it was originally intended to derive the first
four exact moments of $o_2$. It was assumed that if for a given N the statistic $o_2$ is nearly normal
it would not be unreasonable to expect $o_{3/2}$, being of lower order, also to be normally distributed
for the same N. The algebra, however, was found to be extremely heavy. In the following
the first three exact moments of $o_2$ are given only. The fourth moment of $o_2$, derived below,
is correct to order $N^{-2}$.


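In general we have

$$E(o_n) = \Sigma E(\pi_i + \delta p_i)^n = \Sigma E(\pi_i^n + n\pi_i^{n-1}\delta p_i + n_2\pi_i^{n-2}\delta p_i^2 + \ldots), \qquad (20)$$

and

$$\delta o_n = o_n - E(o_n) = n\Sigma\pi_i^{n-1}[\delta p_i - E(\delta p_i)] + n_2\Sigma\pi_i^{n-2}[\delta p_i^2 - E(\delta p_i^2)] + n_3\Sigma\pi_i^{n-3}[\delta p_i^3 - E(\delta p_i^3)] + \ldots. \qquad (21)$$

In particular,

$$\mu'_1(o_2) = E(o_2) = \omega_2 + \frac{1}{N}(1 - \omega_2), \qquad (22)$$

being the exact mean value of $o_2$, and

$$\delta o_2 = 2\Sigma\pi_i\,\delta p_i + \Sigma\delta p_i^2 + \frac{\omega_2 - 1}{N}.$$

Hence the exact nth moment of $o_2$

$$\mu_n(o_2) = E(\delta o_2^n) = E\left[2\Sigma\pi_i\,\delta p_i + \Sigma\delta p_i^2 + \frac{\omega_2 - 1}{N}\right]^n. \qquad (23)$$

For the second moment we find

$$\mu_2(o_2) = \Sigma E(\delta p_i^4) + \Sigma E(\delta p_i^2\,\delta p_j^2) + 4\Sigma\pi_i E(\delta p_i^3) + 4\Sigma\pi_i E(\delta p_i\,\delta p_j^2) + 4\Sigma\pi_i^2 E(\delta p_i^2)$$
$$\qquad + 4\Sigma\pi_i\pi_j E(\delta p_i\,\delta p_j) + 2\,\frac{\omega_2 - 1}{N}\,\Sigma E(\delta p_i^2) + \left(\frac{\omega_2 - 1}{N}\right)^2, \qquad (24)$$

where $i \neq j$ and all permutations permitted. In order to evaluate (24) we must know the
central bivariate moments

$$\mu_{k_i k_j} = E(\delta p_i^{k_i}\,\delta p_j^{k_j}) = \frac{1}{N^{k_i + k_j}}\,E(\delta f_i^{k_i}\,\delta f_j^{k_j})$$

of the multinomial distribution

$$\frac{N!}{f_1!\,f_2!\,f_3!\ldots f_r!}\;\pi_1^{f_1}\pi_2^{f_2}\pi_3^{f_3}\ldots\pi_r^{f_r}.$$

In general we require for the derivation of $\mu_n(o_2)$ central multivariate moments (n variables)

$$\mu_{k_{l_1}k_{l_2}k_{l_3}\ldots k_{l_n}} = E(\delta p_{l_1}^{k_{l_1}}\,\delta p_{l_2}^{k_{l_2}}\,\delta p_{l_3}^{k_{l_3}}\ldots\delta p_{l_n}^{k_{l_n}}) = \frac{1}{N^{k_{l_1}+k_{l_2}+\ldots+k_{l_n}}}\,E(\delta f_{l_1}^{k_{l_1}}\,\delta f_{l_2}^{k_{l_2}}\ldots\delta f_{l_n}^{k_{l_n}}), \qquad (25)$$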







where the various l's can take on all the values 1, 2, 3, ..., r, r being the number of classes
into which the population has been subdivided. The order of the multivariate moment is
$k_{l_1} + k_{l_2} + k_{l_3} + \ldots + k_{l_n}$.

The joint moment-generating function of the multinomial distribution for the case of n
variables is

$$M(t_{l_1}, t_{l_2}, \ldots, t_{l_n}) = (\pi_{l_1}e^{t_{l_1}} + \pi_{l_2}e^{t_{l_2}} + \pi_{l_3}e^{t_{l_3}} + \ldots + \pi_{l_n}e^{t_{l_n}} + \pi)^N, \qquad (26)$$

where $\pi = 1 - \pi_{l_1} - \pi_{l_2} - \pi_{l_3} - \ldots - \pi_{l_n}$.

The expansion of this expression and the collection of appropriate terms of $t^k/k!$ is
straightforward but laborious. The various multivariate moments derived from (26) are moments
about the origin with respect to variables $f_l = Np_l$. They are denoted as $\nu_{k_1k_2k_3k_4}$. The
following moments are needed for subsequent work. Writing i, j, k, l for $l_1, l_2, l_3, l_4$ and
$N^{(r)} = N(N-1)\ldots(N-r+1)$, we find

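$$\nu_{1000} = N\pi_i,$$
$$\nu_{2000} = N\pi_i + N^{(2)}\pi_i^2,$$
$$\nu_{3000} = N\pi_i + 3N^{(2)}\pi_i^2 + N^{(3)}\pi_i^3,$$
$$\nu_{4000} = N\pi_i + 7N^{(2)}\pi_i^2 + 6N^{(3)}\pi_i^3 + N^{(4)}\pi_i^4,$$
$$\nu_{5000} = N\pi_i + 15N^{(2)}\pi_i^2 + 25N^{(3)}\pi_i^3 + 10N^{(4)}\pi_i^4 + N^{(5)}\pi_i^5,$$
$$\nu_{6000} = N\pi_i + 31N^{(2)}\pi_i^2 + 90N^{(3)}\pi_i^3 + 65N^{(4)}\pi_i^4 + 15N^{(5)}\pi_i^5 + N^{(6)}\pi_i^6,$$
$$\nu_{1100} = N^{(2)}\pi_i\pi_j,$$
$$\nu_{1200} = N^{(2)}\pi_i\pi_j + N^{(3)}\pi_i\pi_j^2,$$
$$\nu_{1300} = N^{(2)}\pi_i\pi_j + 3N^{(3)}\pi_i\pi_j^2 + N^{(4)}\pi_i\pi_j^3,$$
$$\nu_{1400} = N^{(2)}\pi_i\pi_j + 7N^{(3)}\pi_i\pi_j^2 + 6N^{(4)}\pi_i\pi_j^3 + N^{(5)}\pi_i\pi_j^4,$$
$$\nu_{2200} = N^{(2)}\pi_i\pi_j + N^{(3)}\pi_i\pi_j(\pi_i + \pi_j) + N^{(4)}\pi_i^2\pi_j^2,$$
$$\nu_{2300} = N^{(2)}\pi_i\pi_j + N^{(3)}\pi_i\pi_j(\pi_i + 3\pi_j) + N^{(4)}\pi_i\pi_j^2(3\pi_i + \pi_j) + N^{(5)}\pi_i^2\pi_j^3,$$
$$\nu_{2400} = N^{(2)}\pi_i\pi_j + N^{(3)}\pi_i\pi_j(\pi_i + 7\pi_j) + N^{(4)}\pi_i\pi_j^2(7\pi_i + 6\pi_j) + N^{(5)}\pi_i\pi_j^3(6\pi_i + \pi_j) + N^{(6)}\pi_i^2\pi_j^4,$$
$$\nu_{1110} = N^{(3)}\pi_i\pi_j\pi_k,$$
$$\nu_{1120} = N^{(3)}\pi_i\pi_j\pi_k + N^{(4)}\pi_i\pi_j\pi_k^2,$$
$$\nu_{1220} = N^{(3)}\pi_i\pi_j\pi_k + N^{(4)}\pi_i\pi_j\pi_k(\pi_j + \pi_k) + N^{(5)}\pi_i\pi_j^2\pi_k^2,$$
$$\nu_{2220} = N^{(3)}\pi_i\pi_j\pi_k + N^{(4)}\pi_i\pi_j\pi_k(\pi_i + \pi_j + \pi_k) + N^{(5)}\pi_i\pi_j\pi_k(\pi_i\pi_j + \pi_j\pi_k + \pi_i\pi_k) + N^{(6)}\pi_i^2\pi_j^2\pi_k^2,$$
$$\nu_{1111} = N^{(4)}\pi_i\pi_j\pi_k\pi_l. \qquad (27)$$

The remaining moments are easily obtained by permutations of $k_1, k_2, k_3, k_4$ and the
corresponding i, j, k, l.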

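The central multivariate-moments $\bar\nu_{k_1k_2k_3\ldots k_n}$ can be obtained from the symbolic identity

$$\bar\nu_{k_1k_2k_3\ldots k_n} = (\nu - N\pi_{l_1})^{k_1}(\nu - N\pi_{l_2})^{k_2}(\nu - N\pi_{l_3})^{k_3}\ldots(\nu - N\pi_{l_n})^{k_n}, \qquad (28)$$

this being a generalization of the formula given by Kendall (1945) on p. 79. Finally, we have

$$\mu_{k_1k_2k_3\ldots k_n} = \frac{1}{N^{k_1+k_2+k_3+\ldots+k_n}}\,\bar\nu_{k_1k_2k_3\ldots k_n}. \qquad (29)$$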










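The transfer of the moments is again connected with a great deal of tedious algebra. The
central moments required are:

$$\mu_{1000} = 0,$$
$$\mu_{2000} = \frac{1}{N}\pi_i(1 - \pi_i),$$
$$\mu_{3000} = \frac{1}{N^2}\pi_i(1 - 3\pi_i + 2\pi_i^2),$$
$$\mu_{4000} = \frac{3}{N^2}\pi_i^2(1 - 2\pi_i + \pi_i^2) + \frac{1}{N^3}\pi_i(1 - 7\pi_i + 12\pi_i^2 - 6\pi_i^3),$$
$$\mu_{5000} = \frac{10}{N^3}\pi_i^2(1 - 4\pi_i + 5\pi_i^2 - 2\pi_i^3) + \frac{1}{N^4}\pi_i(1 - 15\pi_i + 50\pi_i^2 - 60\pi_i^3 + 24\pi_i^4),$$
$$\mu_{6000} = \frac{15}{N^3}\pi_i^3(1 - 3\pi_i + 3\pi_i^2 - \pi_i^3) + \frac{5}{N^4}\pi_i^2(5 - 36\pi_i + 83\pi_i^2 - 78\pi_i^3 + 26\pi_i^4)$$
$$\qquad + \frac{1}{N^5}\pi_i(1 - 31\pi_i + 180\pi_i^2 - 390\pi_i^3 + 360\pi_i^4 - 120\pi_i^5),$$
$$\mu_{1100} = -\frac{1}{N}\pi_i\pi_j,$$
$$\mu_{1200} = -\frac{1}{N^2}\pi_i\pi_j(1 - 2\pi_j),$$
$$\mu_{1300} = -\frac{3}{N^2}\pi_i\pi_j^2(1 - \pi_j) - \frac{1}{N^3}\pi_i\pi_j(1 - 6\pi_j + 6\pi_j^2),$$
$$\mu_{1400} = -\frac{10}{N^3}\pi_i\pi_j^2(1 - 3\pi_j + 2\pi_j^2) - \frac{1}{N^4}\pi_i\pi_j(1 - 14\pi_j + 36\pi_j^2 - 24\pi_j^3),$$
$$\mu_{2200} = \frac{1}{N^2}\pi_i\pi_j(1 - \pi_i - \pi_j + 3\pi_i\pi_j) - \frac{1}{N^3}\pi_i\pi_j(1 - 2\pi_i - 2\pi_j + 6\pi_i\pi_j),$$
$$\mu_{2300} = \frac{1}{N^3}\pi_i\pi_j(1 - \pi_i - 6\pi_j + 5\pi_j^2 + 15\pi_i\pi_j - 20\pi_i\pi_j^2)$$
$$\qquad - \frac{1}{N^4}\pi_i\pi_j(1 - 2\pi_i - 6\pi_j + 6\pi_j^2 + 18\pi_i\pi_j - 24\pi_i\pi_j^2),$$
$$\mu_{2400} = \frac{3}{N^3}\pi_i\pi_j^2(1 - \pi_i - 2\pi_j + \pi_j^2 + 6\pi_i\pi_j - 5\pi_i\pi_j^2)$$
$$\qquad + \frac{1}{N^4}\pi_i\pi_j(1 - \pi_i - 17\pi_j + 42\pi_j^2 - 26\pi_j^3 + 41\pi_i\pi_j - 156\pi_i\pi_j^2 + 130\pi_i\pi_j^3)$$
$$\qquad - \frac{1}{N^5}\pi_i\pi_j(1 - 2\pi_i - 14\pi_j + 36\pi_j^2 - 24\pi_j^3 + 42\pi_i\pi_j - 144\pi_i\pi_j^2 + 120\pi_i\pi_j^3),$$
$$\mu_{1110} = \frac{2}{N^2}\pi_i\pi_j\pi_k,$$
$$\mu_{1120} = -\frac{1}{N^2}\pi_i\pi_j\pi_k(1 - 3\pi_k) + \frac{2}{N^3}\pi_i\pi_j\pi_k(1 - 3\pi_k),$$
$$\mu_{1220} = -\frac{1}{N^3}\pi_i\pi_j\pi_k(2 - 5\pi_j - 5\pi_k + 20\pi_j\pi_k) + \frac{2}{N^4}\pi_i\pi_j\pi_k(1 - 3\pi_j - 3\pi_k + 12\pi_j\pi_k),$$
$$\mu_{2220} = \frac{1}{N^3}\pi_i\pi_j\pi_k(1 - \pi_i - \pi_j - \pi_k + 3\pi_i\pi_j + 3\pi_i\pi_k + 3\pi_j\pi_k - 15\pi_i\pi_j\pi_k)$$
$$\qquad - \frac{1}{N^4}\pi_i\pi_j\pi_k(3 - 7\pi_i - 7\pi_j - 7\pi_k + 26\pi_i\pi_j + 26\pi_i\pi_k + 26\pi_j\pi_k - 130\pi_i\pi_j\pi_k)$$
$$\qquad + \frac{2}{N^5}\pi_i\pi_j\pi_k(1 - 3\pi_i - 3\pi_j - 3\pi_k + 12\pi_i\pi_j + 12\pi_i\pi_k + 12\pi_j\pi_k - 60\pi_i\pi_j\pi_k),$$
$$\mu_{1111} = \frac{3}{N^2}\pi_i\pi_j\pi_k\pi_l - \frac{6}{N^3}\pi_i\pi_j\pi_k\pi_l. \qquad (30)$$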













The other central moments may again be obtained by permutations. We are now in the
position to find $\mu_2(o_2)$ by substituting appropriate moments from (30) into (24), leading to

$$\mu_2(o_2) = \frac{4}{N}(\omega_3 - \omega_2^2) + \frac{2}{N^2}(\omega_2 - 6\omega_3 + 5\omega_2^2) - \frac{2}{N^3}(\omega_2 - 4\omega_3 + 3\omega_2^2), \qquad (31)$$

the first term of which is the same as equation (10) for n = 2. Similarly, $\mu_3(o_2)$ can be found
by using equations (23) and (30), i.e.

$$\mu_3(o_2) = \frac{8}{N^2}(4\omega_4 + 5\omega_2^3 - 9\omega_2\omega_3) - \frac{8}{N^3}(24\omega_4 - 4\omega_3 + 3\omega_2^2 + 22\omega_2^3 - 45\omega_2\omega_3)$$
$$\qquad + \frac{4}{N^4}(88\omega_4 - 24\omega_3 + \omega_2 + 64\omega_2^3 + 15\omega_2^2 - 144\omega_2\omega_3)$$
$$\qquad - \frac{4}{N^5}(48\omega_4 - 16\omega_3 + \omega_2 + 30\omega_2^3 + 9\omega_2^2 - 72\omega_2\omega_3), \qquad (32)$$

and to order $N^{-2}$

$$\mu_4(o_2) = 3[\mu_2(o_2)]^2 + O(N^{-3}). \qquad (33)$$

By virtue of equations (31)-(33):

$$\beta_1(o_2) = \frac{(4\omega_4 + 5\omega_2^3 - 9\omega_2\omega_3)^2}{(\omega_3 - \omega_2^2)^3\,N} + O(N^{-2}), \qquad \beta_2(o_2) = 3 + O(N^{-1}), \qquad (34)$$

and for large N

$$\beta_1(o_2) \to 0, \qquad \beta_2(o_2) \to 3,$$

which indicates the convergence of the sampling distribution of $o_2$ towards normality.
For the special law

$$y\,dx = \frac{1}{10\sqrt{2\pi}}\,e^{-x^2/200}\,dx,$$

for r = 15 (classes) of width $\tfrac12\sigma = 5$ taken symmetrically from x = 0 we have $\omega_2 = 0.139608$
if we consider the class width as unity, in comparison to $\Omega_2 = 0.141047$ for $r = \infty$. Further,

$$\omega_3 = 0.022506, \qquad \omega_4 = 0.003848.$$


A comparison is given below between the first three exact moments of $o_2$ and their
approximations by the first terms in equations (22), (31) and (32) for a sample of N = 1000.

   Moment          Exact              First-term approximation
   μ'_1(o_2)       0.1405             0.1396
   μ_2(o_2)        0.1227 × 10^-4     0.1206 × 10^-4
   μ_3(o_2)        0.5921 × 10^-8     0.5756 × 10^-8
   β_1(o_2)        0.0190             0.0189

The approximation (34) to the exact value of $\beta_1(o_2)$ is apparently satisfactory for N > 1000.
In view of the above, $\beta_2(o_2)$ cannot be substantially different from 3 for N > 1000. Hence it
may be concluded that for N > 1000 the statistic $o_2$ is reasonably normally distributed.
$o_{3/2}$, the other statistic used in practical curve fitting, may also be expected to follow the normal
law for N > 1000, as it is of lower order than $o_2$. The same argument, however, will not hold
in the case of the ratio statistic $a'_1 = o_2/o_{3/2}^2$. Analogous to the moment ratio $\beta_2$, its sampling
distribution will probably turn out to be skew even for large sample sizes.
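
The exact variance (31) lends itself to a direct check. The sketch below enumerates every outcome of a small multinomial sample, computes the variance of $o_2 = \Sigma p_i^2$ by brute force, and compares it with the three-term expression; it then evaluates the first-term and full expressions for the values of $\omega_2$, $\omega_3$ and N quoted in the text. The class probabilities (0.5, 0.3, 0.2) and N = 6 are arbitrary choices for illustration, and all function names are assumptions of this sketch.

```python
from math import factorial


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def multinomial_pmf(counts, probs):
    """Probability of the class frequencies `counts` under class probabilities `probs`."""
    coef = factorial(sum(counts))
    weight = 1.0
    for f, pi in zip(counts, probs):
        coef //= factorial(f)
        weight *= pi ** f
    return coef * weight


def exact_var_o2(n_sample, probs):
    """Variance of o_2 = sum (f_i / N)^2 by complete enumeration."""
    outcomes = [(multinomial_pmf(c, probs), sum((f / n_sample) ** 2 for f in c))
                for c in compositions(n_sample, len(probs))]
    mean = sum(p * o for p, o in outcomes)
    return sum(p * (o - mean) ** 2 for p, o in outcomes)


def var_o2_first_term(n_sample, w2, w3):
    """Leading term of (31), identical with (10) for n = 2."""
    return 4.0 / n_sample * (w3 - w2 ** 2)


def var_o2_formula(n_sample, w2, w3):
    """All three terms of equation (31)."""
    return (var_o2_first_term(n_sample, w2, w3)
            + 2.0 / n_sample ** 2 * (w2 - 6.0 * w3 + 5.0 * w2 ** 2)
            - 2.0 / n_sample ** 3 * (w2 - 4.0 * w3 + 3.0 * w2 ** 2))


if __name__ == "__main__":
    probs = (0.5, 0.3, 0.2)
    w2 = sum(p ** 2 for p in probs)
    w3 = sum(p ** 3 for p in probs)
    print(exact_var_o2(6, probs), var_o2_formula(6, w2, w3))
    print(var_o2_first_term(1000, 0.139608, 0.022506),
          var_o2_formula(1000, 0.139608, 0.022506))
```

Enumeration is feasible only for small N and few classes; it serves here purely as a check on the algebra, not as a computing method.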










Estimation of the parameters of a Pearson Type VII population from large samples

Fisher (1921) has shown that the estimation of the parameters of a Pearson Type VII
population

$$f(x)\,dx = \frac{\Gamma(m)}{\sqrt\pi\,\Gamma(m-\frac12)}\;\frac{c^{2m-1}}{[c^2 + (x-\xi)^2]^m}\,dx \qquad (35)$$

is inefficient in the case of m < 10 if carried out by the method of moments. On the other
hand, the efficient estimation of the parameters by the method of maximum likelihood is
so cumbersome, even using the modified procedure suggested by Jeffreys (1938), that it
seems hardly likely that it will ever be adopted by the practical research worker. In the
following sections an attempt has been made to show that the method of frequency-moments
strikes a balance between the need for efficient estimation and the practical aspect of the
method.


Taking the parameters of (35) in the order $\xi$, c and m we find from maximum likelihood
theory the Hessian determinant

$$\Delta = \begin{vmatrix} \dfrac{(2m-1)m}{(m+1)c^2} & 0 & 0 \\[2ex] 0 & \dfrac{2m-1}{(m+1)c^2} & -\dfrac{1}{cm} \\[2ex] 0 & -\dfrac{1}{cm} & \dfrac{d^2\log\Gamma(m-\frac12)}{dm^2} - \dfrac{d^2\log\Gamma(m)}{dm^2} \end{vmatrix} \qquad (36)$$

(for definition of Hessian see Kendall, 1945, vol. 2, p. 36). From the various zero cross-terms
it can be seen that $\hat\xi$ is uncorrelated with both $\hat c$ and $\hat m$, where $\hat\xi$, $\hat c$ and $\hat m$ are maximum
likelihood estimates of $\xi$, c and m. For large N we have, therefore, for either single or joint
estimation

$$\mathrm{var}(\hat\xi) = \frac{(m+1)\,c^2}{m(2m-1)\,N}. \qquad (37)$$

For estimation of c when m is known and for estimation of m when c is known

$$\mathrm{var}(\hat c) = \frac{(m+1)\,c^2}{(2m-1)\,N}, \qquad (38)$$

$$\mathrm{var}(\hat m) = \frac{1}{[F(m-\frac32) - F(m-1)]\,N}, \qquad (39)$$

writing $F(m) = \dfrac{d^2\log\Gamma(m+1)}{dm^2}$.

For the joint estimation of c and m the variances of the maximum likelihood estimators
are given by

$$\mathrm{var}(\hat c_j) = \frac{m^2(m+1)[F(m-\frac32) - F(m-1)]}{m^2(2m-1)[F(m-\frac32) - F(m-1)] - (m+1)}\cdot\frac{c^2}{N} \qquad (40)$$

and

$$\mathrm{var}(\hat m_j) = \frac{m^2(2m-1)}{m^2(2m-1)[F(m-\frac32) - F(m-1)] - (m+1)}\cdot\frac{1}{N}. \qquad (41)$$

All of these results, with the exception of (40), were derived in Fisher's original paper.










Denoting by a bar the estimators based on ordinary moments, we have

$$\mathrm{var}(\bar\xi) = \frac{c^2}{(2m-3)\,N}, \qquad (42)$$

and for estimation of c and m when the other one is known

$$\mathrm{var}(\bar c) = \frac{m-1}{2m-5}\cdot\frac{c^2}{N}, \qquad (43)$$

$$\mathrm{var}(\bar m) = \frac{(2m-3)^2(m-1)}{2m-5}\cdot\frac{1}{N}. \qquad (44)$$

For the joint estimation of c and m by the method of moments

$$\mathrm{var}(\bar c_j) = \frac{(m-1)(2m-3)(8m^3 - 48m^2 + 108m - 83)}{3(2m-5)(2m-7)(2m-9)}\cdot\frac{c^2}{N}, \qquad (45)$$

$$\mathrm{var}(\bar m_j) = \frac{2(m-1)(2m-5)(2m-3)^2(2m^2 - 5m + 12)}{3(2m-7)(2m-9)\,N}, \qquad (46)$$

all of which were derived previously by Fisher (1921), with the exception of (44) and (45).
In practice we only need to consider the following cases:
(a) Estimation of $\xi$,
(b) Estimation of c when m is known,
(c) Estimation of c and m jointly.
In almost every application we will have to deal with case (a).

(b) arises in the case of estimating the scale parameter of an error distribution, especially 
if N <500. There is strong theoretical and experimental evidence that the errors of truly 
independent observations are distributed in a Pearson Type VII law with a shape parameter 
3<m<5. Jeffreys (1938, 1939, 1948) has pointed out that the estimation of m from the data 
of a particular experience becomes unreliable if N is less than 500. In that case it is advisable 
to assume m as being known a priori, in the light of previous experimental evidence, with 
a magnitude of 4, say. 

Most frequently, in practice, we do not know the magnitude of any of the parameters. 
We then have to estimate them from the data simultaneously, provided N is large enough. 
This is case (c). 

The method of frequency-moments lacks a location estimator. We, therefore, have to use
either the mean or the median for estimating $\xi$. Problem (a) resolves into the question which
of the two statistics, mean or median, is the more efficient one and for what range. From
(37) and (42) we have, taking the maximum likelihood estimator as standard,

$$\text{Eff.}(\bar\xi) = \frac{(m+1)(2m-3)}{m(2m-1)}, \qquad (47)$$

where $\bar\xi$ is the mean statistic. The variance for the median statistic is

$$\mathrm{var}(\tilde\xi) = \frac{1}{4y_0^2 N} = \frac{\pi c^2[\Gamma(m-\frac12)]^2}{4[\Gamma(m)]^2 N}, \qquad (48)$$

where $y_0$ is the ordinate of (35) at $x = \xi$, and its efficiency

$$\text{Eff.}(\tilde\xi) = \frac{4(m+1)[\Gamma(m)]^2}{\pi m(2m-1)[\Gamma(m-\frac12)]^2}. \qquad (49)$$

Table 1 gives some numerical values for the respective efficiencies. It can be seen that for
m < 3 the median statistic is the more efficient estimator for $\xi$. The exact crossing-over of
efficiencies takes place at m = 2.840.








Table 1

    m      Eff. (mean)   Eff. (median)
    1        0.0           0.811
    1.5      0.0           0.833
    2        0.500         0.811
    3        0.800         0.769
    4        0.893         0.741
    5        0.933         0.723
    6        0.955         0.710
    7        0.967         0.700
    8        0.975         0.693
    9        0.980         0.687
   10        0.984         0.682
    ∞        1.000         0.637
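
The entries of Table 1 and the crossing-over point can be recomputed from (47) and (49). The sketch below uses the log-gamma function of the standard library and a plain bisection between m = 2 and m = 3, where the tabulated efficiencies change order; the bracketing interval and the function names are choices made for this illustration.

```python
import math


def eff_mean(m):
    """Efficiency (47) of the mean as an estimator of location; zero when m <= 3/2."""
    if m <= 1.5:
        return 0.0
    return (m + 1.0) * (2.0 * m - 3.0) / (m * (2.0 * m - 1.0))


def eff_median(m):
    """Efficiency (49) of the median as an estimator of location."""
    gamma_ratio_sq = math.exp(2.0 * (math.lgamma(m) - math.lgamma(m - 0.5)))
    return 4.0 * (m + 1.0) * gamma_ratio_sq / (math.pi * m * (2.0 * m - 1.0))


def crossing_point(lo=2.0, hi=3.0, tol=1e-10):
    """Bisection for the m at which mean and median are equally efficient."""
    def diff(m):
        return eff_mean(m) - eff_median(m)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(lo) * diff(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for m in (1.0, 1.5, 2.0, 3.0, 5.0, 10.0):
        print(m, round(eff_mean(m), 3), round(eff_median(m), 3))
    print("crossing near m =", round(crossing_point(), 3))
```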

















In the practical applications of the second part of this paper, the median statistic was used
for locating the curve whenever m', the frequency-moment estimator, was < 2.840. In all
other cases the mean statistic was employed.

Estimation problem (b) leads to the interesting discovery that, when N is large, the
maximum likelihood estimator of c can be expressed in terms of simple working probability-moments.
For the likelihood solution of (35) we have

$$\frac{\partial}{\partial c}\,\Sigma\log y = \frac{(2m-1)N}{c} - 2mc\sum_{i=1}^{N}\frac{1}{c^2 + (x_i-\xi)^2} = 0.$$

This may be written

$$\frac{2m-1}{2m} = \frac{1}{N}\left(\frac{c}{y_0}\right)^{1/m}\sum_{i=1}^{N}[f(x_i)]^{1/m}, \qquad (50)$$

where

$$y_0 = \frac{\Gamma(m)}{\sqrt\pi\,\Gamma(m-\frac12)}.$$

But for large N,

$$\sum_{i=1}^{N}[f(x_i)]^{1/m} \approx N\int f(x)\,[f(x)]^{1/m}\,dx = N\omega_{(m+1)/m},$$

and (50) becomes

$$\hat c = \left(\frac{2m-1}{2m}\right)^m \frac{\Gamma(m)}{\sqrt\pi\,\Gamma(m-\frac12)}\;o_{(m+1)/m}^{-m} = c_{(m+1)/m}. \qquad (51)$$

For a population parameter m = 1 we have

$$\hat c = \frac{1}{2\pi o_2} = c_2, \qquad (52)$$

and for m = 2

$$\hat c = \frac{9}{8\pi o_{3/2}^2} = c_{3/2}. \qquad (53)$$

Hence it follows that we will obtain maximum efficiency by estimating c from frequency-moments
of order 2 and 3/2 in the case of m = 1 and 2 subject to N being large.


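The variance of the frequency-moment estimator $\hat c$ for small deviations and large N, is,
from (51),

$$\mathrm{var}(\hat c) = \left(\frac{\partial c}{\partial\omega_{(m+1)/m}}\right)^2 \mathrm{var}(o_{(m+1)/m}) = \left(\frac{mc}{\omega_{(m+1)/m}}\right)^2 \mathrm{var}(o_{(m+1)/m});$$

from (10)

$$\mathrm{var}(o_{(m+1)/m}) = \frac{1}{N}\left(\frac{m+1}{m}\right)^2[\omega_{(m+2)/m} - \omega_{(m+1)/m}^2],$$

so that

$$\mathrm{var}(\hat c) = \frac{(m+1)^2 c^2}{N\,\omega_{(m+1)/m}^2}\,[\omega_{(m+2)/m} - \omega_{(m+1)/m}^2]. \qquad (54)$$

For the Type VII law:

$$\omega_n = (\sqrt\pi\,c)^{1-n}\left[\frac{\Gamma(m)}{\Gamma(m-\frac12)}\right]^n \frac{\Gamma(nm-\frac12)}{\Gamma(nm)}. \qquad (55)$$

Putting n = (m + 2)/m and n = (m + 1)/m and substituting into (54) gives finally

$$\mathrm{var}(\hat c) = \frac{m+1}{2m-1}\cdot\frac{c^2}{N}, \qquad (56)$$

which is the same as (38), the variance of the maximum likelihood estimator.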

When fitting by frequency-moments in practice, the only sets of statistics used so far
were $J_1$, $J_2$ and $J_3$ and $J_1$, $J_{3/2}$ and $J_2$ (Sichel, 1947). It is desirable to keep to this procedure in
order to avoid computational complications and, generally, to make the method as simple
as possible.

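Corresponding to the sets of statistics just mentioned we have for a Type VII law the
frequency-moment estimators (52) and (53). For large N their respective variances are
from (10)

$$\mathrm{var}(c_2) = \left(\frac{c}{\omega_2}\right)^2 \mathrm{var}(o_2) = \frac{4}{N}[\alpha_1 - 1]\,c^2, \qquad (57)$$

and

$$\mathrm{var}(c_{3/2}) = \left(\frac{2c}{\omega_{3/2}}\right)^2 \mathrm{var}(o_{3/2}) = \frac{9}{N}[\alpha'_1 - 1]\,c^2, \qquad (58)$$

where $\alpha_1 = \omega_3/\omega_2^2$ and $\alpha'_1 = \omega_2/\omega_{3/2}^2$. By use of equation (55) we can express $\alpha_1$ and $\alpha'_1$
as functions of m. Finally, from (38), (57) and (58),

$$\text{Eff.}(c_2) = \frac{m+1}{4(2m-1)}\left\{\frac{\Gamma(m-\frac12)\,\Gamma(3m-\frac12)}{\Gamma(m)\,\Gamma(3m)}\left[\frac{\Gamma(2m)}{\Gamma(2m-\frac12)}\right]^2 - 1\right\}^{-1}, \qquad (59)$$

$$\text{Eff.}(c_{3/2}) = \frac{m+1}{9(2m-1)}\left\{\frac{\Gamma(m-\frac12)\,\Gamma(2m-\frac12)}{\Gamma(m)\,\Gamma(2m)}\left[\frac{\Gamma(\frac32 m)}{\Gamma(\frac32 m-\frac12)}\right]^2 - 1\right\}^{-1}. \qquad (60)$$

The corresponding efficiency for the solution by moments from equations (38) and (43) is

$$\text{Eff.}(\bar c) = \frac{(m+1)(2m-5)}{(m-1)(2m-1)}. \qquad (61)$$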


Numerical values for the efficiencies (59), (60) and (61) are tabulated in Table 2. From this
table we may draw some interesting conclusions:

Table 2

    m      Eff. (c_2)   Eff. (c_3/2)   Eff. (c, moments)
    0.5      0.0          0.0            0.0
    0.8      0.993        0.904          0.0
    1        1.000        0.951          0.0
    2        0.962        1.000          0.0
    3        0.926        0.992          0.400
    4        0.903        0.981          0.714
    5        0.887        0.972          0.833
    6        0.875        0.965          0.891
    7        0.867        0.960          0.923
    8        0.860        0.955          0.943
    9        0.855        0.952          0.956
   10        0.851        0.949          0.965
    ∞        0.808        0.916          1.000



















(1) For strongly leptokurtic populations of the Type VII law (low values of m) the
frequency-moment estimators are substantially more efficient than the conventional moment
estimators.

(2) $c_{3/2}$ is a more efficient estimator than $c_2$, having the remarkable property of varying in
efficiency only between 0.92 and 1.00 for such a wide range as 1 < m < ∞.

It is the Type VII law of low m which is of practical importance. For example, a normal
curve can very well represent a Type VII law even down to m = 7. For a sample size of say
N = 1000 the χ²-test would still indicate a very good agreement with normal theory even
if the true population follows a Type VII law. For a sample size of N = 1000 we shall detect
real leptokurtosis only if m < 7. It is precisely in this range that the method of
frequency-moments scores over the method of moments. It also compares very favourably with the
maximum likelihood method because the efficiency of $c_{3/2}$ varies between 0.95 and 1.00 for
1 < m < 7, the computational work, however, being less laborious.
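
The efficiencies of the two frequency-moment estimators of c can be recomputed from the Type VII probability-moments (55) and the variance formula (10). The sketch below is written for a general order n: it assumes that c is estimated from $o_n$ through $c \propto \omega_n^{1/(1-n)}$, so that the large-sample variance is $(n/(n-1))^2[\omega_{2n-1}/\omega_n^2 - 1]\,c^2/N$, which reduces to (57) for n = 2 and to (58) for n = 3/2. Function names are illustrative.

```python
import math


def gamma_ratio(x):
    """Gamma(x) / Gamma(x - 1/2), computed through log-gamma for stability."""
    return math.exp(math.lgamma(x) - math.lgamma(x - 0.5))


def moment_ratio(m, n):
    """omega_{2n-1} / omega_n^2 for a Type VII law, from equation (55)."""
    return gamma_ratio(n * m) ** 2 / (gamma_ratio(m) * gamma_ratio((2.0 * n - 1.0) * m))


def eff_frequency_moment_c(m, n):
    """Efficiency of the estimator of c based on o_n, relative to (38)."""
    n_var_fm = (n / (n - 1.0)) ** 2 * (moment_ratio(m, n) - 1.0)   # N var / c^2
    n_var_ml = (m + 1.0) / (2.0 * m - 1.0)                         # N var / c^2, equation (38)
    return n_var_ml / n_var_fm


def eff_moment_c(m):
    """Efficiency (61) of the ordinary moment estimator of c; zero when m <= 5/2."""
    if m <= 2.5:
        return 0.0
    return (m + 1.0) * (2.0 * m - 5.0) / ((m - 1.0) * (2.0 * m - 1.0))


if __name__ == "__main__":
    for m in (1.0, 2.0, 3.0, 5.0, 10.0):
        print(m,
              round(eff_frequency_moment_c(m, 2.0), 3),
              round(eff_frequency_moment_c(m, 1.5), 3),
              round(eff_moment_c(m), 3))
```

The values printed for m = 1 and m = 2 equal unity for the orders 2 and 3/2 respectively, in agreement with (52) and (53).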

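Finally, we have to consider the most frequent estimation problem (c) when c and m
are unknown. We have

$$a'_1 = \phi(m'_j),$$

where $a'_1 = o_2/o_{3/2}^2$ estimates $\alpha'_1 = \omega_2/\omega_{3/2}^2$. For small deviations of $m'_j$ from m and for large N

$$\mathrm{var}(m'_j) = \left(\frac{\partial\alpha'_1}{\partial m}\right)^{-2}\mathrm{var}(a'_1) = \left[\alpha'_1\,\frac{\partial\log\alpha'_1}{\partial m}\right]^{-2}\mathrm{var}(a'_1). \qquad (62)$$

$\mathrm{var}(a'_1)$ may be expressed in terms of m with the help of equation (55). (62) then becomes

$$\mathrm{var}(m'_j) = \frac{\dfrac{\Gamma(m-\frac12)}{\Gamma(m)}\left\{\dfrac{4\,\Gamma(3m-\frac12)}{\Gamma(3m)}\left[\dfrac{\Gamma(2m)}{\Gamma(2m-\frac12)}\right]^2 + \dfrac{9\,\Gamma(2m-\frac12)}{\Gamma(2m)}\left[\dfrac{\Gamma(\frac32 m)}{\Gamma(\frac32 m-\frac12)}\right]^2 - \dfrac{12\,\Gamma(\frac52 m-\frac12)\,\Gamma(\frac32 m)\,\Gamma(2m)}{\Gamma(\frac52 m)\,\Gamma(\frac32 m-\frac12)\,\Gamma(2m-\frac12)}\right\} - 1}{N\{[\digamma(m-1) - \digamma(m-\frac32)] + 2[\digamma(2m-1) - \digamma(2m-\frac32)] - 3[\digamma(\frac32 m-1) - \digamma(\frac32 m-\frac32)]\}^2}, \qquad (63)$$

where $\digamma(x) = d\log\Gamma(x+1)/dx$.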


The efficiency of the joint estimator $m'_j$ (frequency-moment method) is

$$\text{Eff.}(m'_j) = \frac{\mathrm{var}(\hat m_j)\ \text{equation (41)}}{\mathrm{var}(m'_j)\ \text{equation (63)}}. \qquad (64)$$

For the corresponding moment method

$$\text{Eff.}(\bar m_j) = \frac{\mathrm{var}(\hat m_j)\ \text{equation (41)}}{\mathrm{var}(\bar m_j)\ \text{equation (46)}}. \qquad (65)$$

The efficiencies (64) and (65) have been tabulated in Table 3.








Table 3

    m      Eff. (m'_j, frequency-moments)   Eff. (m_j, moments)
    0.5      0.0                              0.0
    0.7      0.677                            0.0
    1        0.758                            0.0
    2        0.697                            0.0
    3        0.651                            0.0
    4        0.624                            0.0
    5        0.607                            0.169
    6        0.596                            0.429
    7        0.588                            0.594
    8        0.582                            0.699
    9        0.577                            0.769
   10        0.573                            0.818
    ∞        0.537                            1.000






















Again, the frequency-moment estimates are much better than the moment estimates in
the important range 1 < m < 7. There is a loss of information in comparison to the maximum
likelihood solution. However, it is not as serious as it first may seem, because we deal with
large samples, so that the variances will be comparatively small. Furthermore, given a
particular sample after the sampling has taken place, it is the ratio of the standard errors
rather than the ratio of the variances which is of practical importance. This value $\sqrt{[\text{Eff.}(m'_j)]}$
is quite reasonable for the range 1 < m < 7. On the other hand, the computational saving is
considerable when using the new method.

It is hoped that the frequency-moment method will appeal to the practical research 
worker as a reasonably efficient method which involves fairly simple arithmetic. 


Part 2. PRACTICAL ILLUSTRATIONS


It has been pointed out repeatedly that the method of moments will completely break down
when, for a Type VII law, the shape parameter m < 5/2. Jeffreys (1948) remarks: 'In this case
the expectation of the fourth moment is infinite; but the actual fourth moment of any finite
set of observations is necessarily finite and any set of observations derived from such a law
would be interpreted as having m > 5/2.' Again, Fisher (1921) proved that the estimation of
the parameters of a Type VII law by the method of moments becomes very inefficient except
in the region near normality.

Here, then, is a field for the practical application of the method of frequency-moments 
because all probability-moments exist and because the efficiencies of estimating the para- 
meters by this method are greater than in the case of the ordinary moment solution. 

In testing the goodness of fit we know that the contributions to the total χ² are made up of

(a) a portion deriving from 'errors of estimation';

(b) a portion deriving from 'discrepancies of observations from hypothesis' (Fisher, 1948).

The errors of estimation are dependent on the efficiency of the estimation process. In
general, we should expect smaller χ²'s when fitting a Type VII law with m < 7 by the
frequency-moment method than when fitting by the method of moments. This hypothesis was
put to a practical test by graduating observed distributions first by the method of moments
and then by the method of frequency-moments.

Unless we employ fine grouping we introduce a bias into the estimation of parameters by
equating the nth working frequency-moment statistic to the integral of the nth power of the
probability law, i.e.

$$h^{1-n}\sum_{i=0}^{r+1} f_i^n \approx N^n\int_{-\infty}^{+\infty} y^n\,dx = J_n,$$

where h = width of class interval. This difficulty may be overcome by estimating the
midordinates $u_i$ of the frequency groups by Hardy's formula (1909)

$$u_i = f_i - \tfrac{1}{24}\Delta^2 f_{i-1},$$

where $f_i$ represent the original frequencies. The sum of the nth powers of the midordinates
is an estimate of $J_n$, provided there exists high contact at both ends of the experience.
An example will clearly illustrate the advantage of this procedure.
An example will clearly illustrate the advantage of this procedure. 


The frequencies of

$$y\,dx = \frac{3000}{\sqrt\pi}\,e^{-0.09x^2}\,dx$$























Table 4

   Class interval     Frequency
       0-1             1643
       1-2             1376.5
       2-3              965
       3-4              567
       4-5              279
       5-6              115
       6-7               39.5
       7-8               11.5
       8-9                3
       9-10               0.5
      10-11               0
   Total               5000.0
   Total of entire
   distribution       10000.0

were arranged as shown in Table 4. Using uncorrected frequencies we have

$$J_1 = 10000, \qquad J_2 = 11879366, \qquad \rho_2 = \frac{J_1^2}{J_2} = 8.41796.$$

In a normal population $\sigma = 0.282095\,\rho_2$. Hence s = 2.3747 as compared with $\sigma$ = 2.3570.
For the corrected frequencies (by Hardy's rule) we find

$$J_1 = 10000, \qquad J_2 = 11965747, \qquad \rho_2 = 8.35719.$$

Hence $s_{\text{cor.}}$ = 2.3575, which is very near to the true value.
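
The calculation just described can be sketched in a few lines. The code below applies Hardy's midordinate formula $u_i = f_i - \frac{1}{24}\Delta^2 f_{i-1}$ to the frequencies of Table 4, mirrored about zero to give the entire distribution, and estimates the standard deviation from $\rho_2 = J_1^2/J_2$ with $\sigma = \rho_2/(2\sqrt\pi)$. Padding the series with zero frequencies beyond both ends is an assumption of this sketch, and the small negative adjustment that arises in the empty end cell is left as it stands rather than combined with its neighbours, so the corrected value agrees with the text only to about four figures.

```python
import math


def hardy_midordinates(freqs):
    """Apply u_i = f_i - (1/24) * second central difference of f at i."""
    padded = [0.0] + [float(f) for f in freqs] + [0.0]   # zero frequency beyond both ends
    return [padded[i] - (padded[i - 1] - 2.0 * padded[i] + padded[i + 1]) / 24.0
            for i in range(1, len(padded) - 1)]


def sigma_from_second_frequency_moment(values):
    """Estimate sigma of a normal law from rho_2 = J_1^2 / J_2 (class width unity)."""
    j1 = sum(values)
    j2 = sum(v * v for v in values)
    rho2 = j1 * j1 / j2
    return rho2 / (2.0 * math.sqrt(math.pi))


# Frequencies of Table 4 (one half of the symmetrical distribution).
HALF = [1643, 1376.5, 965, 567, 279, 115, 39.5, 11.5, 3, 0.5, 0]
FULL = HALF[::-1] + HALF

if __name__ == "__main__":
    s_raw = sigma_from_second_frequency_moment(FULL)
    s_cor = sigma_from_second_frequency_moment(hardy_midordinates(FULL))
    print("uncorrected:", round(s_raw, 4))
    print("corrected:  ", round(s_cor, 4))
```

The correction leaves $J_1$ unchanged, since the second differences of a series that vanishes at both ends sum to zero.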

For the purpose of testing the effect of grouping, two adjacent frequency groups were 
amalgamated and finally three groups, each set giving rise to two possible variations. When 
applying the midordinate rule it sometimes happens that there is a negative adjustment to 
a cell at the end of the experience which in the original series had no frequency. In such a 
case it is best to combine two or three groups and apply their total adjustments jointly. 

Table 5 gives a comparison of corrected and uncorrected standard deviations as estimated
from the various groupings of the original frequencies of Table 4. The adjustments are not
as effective as Sheppard's corrections are to the raw moments, but in all cases they decrease
the bias considerably.

Table 5

   Class       Uncorrected    Bias        Corrected    Bias
   interval    s              s - σ       s_cor.       s_cor. - σ
      1          2.375        0.018        2.358        0.001
      2          2.427        0.070        2.365        0.008
      2          2.427        0.070        2.365        0.008
      3          2.518        0.161        2.396        0.039
      3          2.512        0.155        2.389        0.032







a eee 










When fitting a Type VII law distribution by the method of frequency-moments we must
solve the equation

$$\alpha'_1 = \frac{\Gamma(m-\frac12)\,\Gamma(2m-\frac12)}{\Gamma(m)\,\Gamma(2m)}\left[\frac{\Gamma(\frac32 m)}{\Gamma(\frac32 m-\frac12)}\right]^2$$

for m, where $\alpha'_1 = \omega_2/\omega_{3/2}^2$. Table 6 facilitates the computations to a great extent. The slight
irregularities in the δ² column are due to the uncertainty of the last digit in the calculation
of $\alpha'_1$. No attempt of adjusting the second differences was made.






































Table 6

(δ² and δ⁴ are given in units of the fifth decimal place. The tabular interval in m is 0.1 up to
m = 4.4, 0.2 from 4.4 to 9.0, 0.5 from 9.0 to 15.0 and 1.0 from 15.0 to 22.0.)

    m     α'_1      δ²+     δ⁴+   |    m     α'_1     δ²+  |    m      α'_1     δ²+
   0.5     ∞                      |   3.7   1.08289    6   |    9.0   1.06869    5
   0.6   2.09618                  |   3.8   1.08218    4   |    9.5   1.06823    5
   0.7   1.54561   37382          |   3.9   1.08151    4   |   10.0   1.06782    4
   0.8   1.36886    9132   22688  |   4.0   1.08088    3   |   10.5   1.06745    4
   0.9   1.28343    3570    3737  |   4.1   1.08028    4   |   11.0   1.06712    3
   1.0   1.23370    1745    1056  |   4.2   1.07972    2   |   11.5   1.06682    3
   1.1   1.20142     976     391  |   4.3   1.07919    3   |   12.0   1.06655    2
   1.2   1.17890     598     172  |   4.4   1.07869    3   |   12.5   1.06630    2
   1.3   1.16236     392      83  |                        |   13.0   1.06607    2
   1.4   1.14974     269      48  |   4.4   1.07869   11   |   13.5   1.06586    1
   1.5   1.13981     194      22  |   4.6   1.07777    8   |   14.0   1.06566    2
   1.6   1.13182     141      21  |   4.8   1.07693    9   |   14.5   1.06548    1
   1.7   1.12524     109          |   5.0   1.07618    6   |   15.0   1.06531    1
   1.8   1.11975      84          |   5.2   1.07549    5   |
   1.9   1.11510      66          |   5.4   1.07485    6   |   15.0   1.06531    5
   2.0   1.11111      53          |   5.6   1.07427    5   |   16.0   1.06501    3
   2.1   1.10765      44          |   5.8   1.07374    3   |   17.0   1.06474    3
   2.2   1.10463      36          |   6.0   1.07324    5   |   18.0   1.06450    3
   2.3   1.10197      29          |   6.2   1.07279    2   |   19.0   1.06429    2
   2.4   1.09960      26          |   6.4   1.07236    3   |   20.0   1.06410    2
   2.5   1.09749      21          |   6.6   1.07196    3   |   21.0   1.06393    2
   2.6   1.09559      19          |   6.8   1.07159    3   |   22.0   1.06378
   2.7   1.09388      15          |   7.0   1.07124    3   |
   2.8   1.09232      14          |   7.2   1.07092    1   |    ∞     1.06066
   2.9   1.09090      12          |   7.4   1.07061    2   |
   3.0   1.08960      11          |   7.6   1.07032    2   |
   3.1   1.08841       9          |   7.8   1.07005    1   |
   3.2   1.08731       8          |   8.0   1.06979    1   |
   3.3   1.08629       8          |   8.2   1.06954    2   |
   3.4   1.08535       6          |   8.4   1.06931    1   |
   3.5   1.08447       7          |   8.6   1.06909    2   |
   3.6   1.08366       4          |   8.8   1.06889    0   |
   3.7   1.08289       6          |   9.0   1.06869    1   |























A detailed calculation of the various statistics involved in the fitting of a Type VII
distribution by the method of frequency-moments is shown in Table 7. The observed experience
is one of the marginal distributions of the bivariate table, given in Shewhart's book (1931,
p. 402), relating to the distribution of random noises (machine measures).

Now

$$J_n = \Sigma(u_i)^n, \qquad o_{3/2} = \frac{J_{3/2}}{J_1^{3/2}} = 0.38603, \qquad a'_1 = \frac{o_2}{o_{3/2}^2} = \frac{J_1 J_2}{J_{3/2}^2} = 1.08387.$$

From Table 6 we see that for this value of $a'_1$

$$3.5 < m < 3.6,$$

and by inverse interpolation we find m' = 3.574.



































420 Method of frequency-moments 
Table 7 
Central | f_i | Δf_i | Δ²f_i | −(1/24)Δ²f_i | u_i   | (u_i)^(3/2) | (u_i)²    | f̂_i   | f_i − f̂_i | (f_i − f̂_i)²/f̂_i
value   |     |      |       |              |       |             |           |       |           |
0.1349  |   0 |      |       |              |       |             |           |   0.9 |           |
0.1421  |   0 |    0 |     1 |   0.0        |       |             |           |   0.7 |           |
0.1493  |   1 |    1 |    -1 |   0.0        |   1.0 |        1.00 |      1.00 |   1.4 |   -3.7    | 0.999
0.1565  |   1 |    0 |     7 |  -0.3        |   0.7 |        0.59 |      0.49 |   3.2 |           |
0.1637  |   8 |    7 |     8 |  -0.3        |   7.7 |       21.33 |     59.29 |   7.5 |           |
0.1709  |  23 |   15 |     9 |  -0.4        |  22.6 |      107.35 |    510.76 |  18.6 |   +4.4    | 1.041
0.1781  |  47 |   24 |    26 |  -1.1        |  45.9 |      310.74 |   2106.81 |  45.5 |   +1.5    | 0.049
0.1853  |  97 |   50 |    45 |  -1.9        |  95.1 |      927.22 |   9044.01 | 101.7 |   -4.7    | 0.217
0.1925  | 192 |   95 |   -62 |  +2.6        | 194.6 |     2714.67 |  37869.16 | 181.8 |  +10.2    | 0.572
0.1997  | 225 |   33 |   -86 |  +3.6        | 228.6 |     3456.43 |  52257.96 | 224.4 |   +0.6    | 0.002
0.2069  | 172 |  -53 |   -22 |  +0.9        | 172.9 |     2273.63 |  29894.41 | 180.0 |   -8.0    | 0.356
0.2141  |  97 |  -75 |    23 |  -1.0        |  96.0 |      940.80 |   9216.00 | 100.0 |   -3.0    | 0.090
0.2213  |  45 |  -52 |    26 |  -1.1        |  43.9 |      291.06 |   1927.21 |  44.6 |   +0.4    | 0.004
0.2285  |  19 |  -26 |    14 |  -0.6        |  18.4 |       78.94 |    338.56 |  18.2 |   +0.8    | 0.035
0.2357  |   7 |  -12 |    12 |  -0.5        |   6.5 |       16.57 |     42.25 |   7.4 |           |
0.2429  |   7 |    0 |    -6 |  +0.3        |   7.3 |       19.71 |     53.29 |   3.1 |           |
0.2501  |   1 |   -6 |     5 |  -0.2        |   0.8 |        0.71 |      0.64 |   1.4 |   +1.5    | 0.167
0.2573  |   0 |   -1 |     1 |   0.0        |       |             |           |   0.7 |           |
0.2645  |   0 |    0 |       |              |       |             |           |   0.9 |           |
Totals  | 942 |    0 |     0 |   0.0        | 942.0 |    11160.75 | 143321.84 | 942.0 |    0.0    | 3.532

The deviations −3.7 and +1.5 and their χ² contributions refer to the pooled tail groups 0.1349-0.1637 and 0.2357-0.2645 respectively.
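The adjustment columns of Table 7 can be reproduced approximately with the short sketch below. It rests on assumptions that are not spelled out in this part of the paper: the rule u_i = f_i − Δ²f_i/24 is read off the table's column headings and values, cells are padded with zero frequencies beyond both ends, and negative adjusted values are clipped to zero (the table simply leaves those cells blank). The function names are hypothetical.

```python
# Observed frequencies f_i from Table 7 (central values 0.1349 ... 0.2645).
f = [0, 0, 1, 1, 8, 23, 47, 97, 192, 225, 172, 97, 45, 19, 7, 7, 1, 0, 0]


def adjusted_ordinates(freqs):
    """u_i = f_i - (second central difference of f)/24, clipped at zero.
    The rule is inferred from the columns of Table 7 (an assumption)."""
    padded = [0] + list(freqs) + [0]
    u = []
    for i in range(1, len(padded) - 1):
        second_diff = padded[i + 1] - 2 * padded[i] + padded[i - 1]
        u.append(max(0.0, padded[i] - second_diff / 24.0))
    return u


def frequency_moment(u, r):
    """J_r = sum of (u_i)^r."""
    return sum(v ** r for v in u)


u = adjusted_ordinates(f)
J1 = frequency_moment(u, 1.0)
J15 = frequency_moment(u, 1.5)
J2 = frequency_moment(u, 2.0)
alpha = J1 * J2 / J15 ** 2   # ratio consistent with the totals of Table 7
print(round(J1, 2), round(alpha, 4))
```

Because the sketch keeps unrounded u_i while the table works to one decimal, the resulting ratio agrees with the printed 1.08387 only to about three decimals.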
































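Further, we have
$$a = \frac{1}{\sqrt{\pi}\,\theta^{2}} \left[\frac{\Gamma(m)}{\Gamma(m-\tfrac12)}\right]^{3} \left[\frac{\Gamma(\tfrac32 m-\tfrac12)}{\Gamma(\tfrac32 m)}\right]^{2}.$$
Hence a = 3.91345.
Finally,
$$y_0 = \frac{J_1\,\Gamma(m)}{a\sqrt{\pi}\,\Gamma(m-\tfrac12)} = 228.66,$$
and the equation of the fitted curve, taking the class interval as unity, is
$$y = 228.66\,(1 + 0.065295\,z^{2})^{-3.574}.$$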
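As a cross-check on the fitted constants, the sketch below recomputes m, a and y_0 from the totals of Table 7. It is not the author's computing scheme: it assumes the Type VII curve y = y_0(1 + z²/a²)^(−m) with class interval unity, for which J_r = y_0^r a √π Γ(rm − ½)/Γ(rm), and it solves for m by bisection instead of using Table 6. All function names are hypothetical.

```python
import math


def G(x):
    """Gamma(x - 1/2) / Gamma(x), via log-gamma for numerical stability."""
    return math.exp(math.lgamma(x - 0.5) - math.lgamma(x))


def alpha_of_m(m):
    """Theoretical J1*J2 / J_{3/2}^2 for a Type VII curve (assumed form)."""
    return G(m) * G(2.0 * m) / G(1.5 * m) ** 2


def solve_m(alpha, lo=1.0, hi=50.0, tol=1e-10):
    """Bisection; alpha_of_m is assumed to decrease with m on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if alpha_of_m(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Totals of the u, u^(3/2) and u^2 columns of Table 7.
J1, J15, J2 = 942.0, 11160.75, 143321.84

alpha = J1 * J2 / J15 ** 2
theta = J15 / J1 ** 1.5
m = solve_m(alpha)
a = G(1.5 * m) ** 2 / (math.sqrt(math.pi) * theta ** 2 * G(m) ** 3)
y0 = J1 / (a * math.sqrt(math.pi) * G(m))
print(round(m, 3), round(a, 3), round(y0, 1))
```

The values agree with the printed ones to about three significant figures; small differences in the later digits are to be expected from table interpolation and hand computation.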


As m > 2.84 we use the mean statistic for locating the experience. Centring the frequency
group 225 at zero on the z-scale and having the positive direction of the scale downwards
in the f_i column of Table 7, we find
z̄ = −0.011677.

Ordinates were calculated at the beginning, centre and end of each group, and areas were
found by a quadrature formula. The expected frequencies are given in the f̂_i column of
Table 7. For χ² = 3.532 we find for 7 degrees of freedom
P = 0.83.
A corresponding moment fit leads to the equation
$$y = 215.52\,(1 + 0.029992\,z^{2})^{-6.227},$$
with χ² = 6.756 and P = 0.46.

In this particular case the frequency-moment method gives, therefore, a better fit than the
conventional method.
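The goodness-of-fit figures can be checked with the following sketch. The observed and expected frequencies are those of Table 7, with the five cells in each tail pooled as in the table; the closed-form upper-tail probability used here is valid for an odd number of degrees of freedom only, and the helper names are hypothetical.

```python
import math

observed = [0, 0, 1, 1, 8, 23, 47, 97, 192, 225, 172, 97, 45, 19, 7, 7, 1, 0, 0]
expected = [0.9, 0.7, 1.4, 3.2, 7.5, 18.6, 45.5, 101.7, 181.8, 224.4,
            180.0, 100.0, 44.6, 18.2, 7.4, 3.1, 1.4, 0.7, 0.9]


def pool_tails(values, n_tail):
    """Merge the first and the last n_tail cells into single tail groups."""
    return [sum(values[:n_tail])] + list(values[n_tail:-n_tail]) + [sum(values[-n_tail:])]


def chi_square(obs, exp):
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


def chi2_upper_tail_odd(x, dof):
    """P(chi-square > x) for odd dof, using the finite series
    2*Q(sqrt x) + 2*phi(sqrt x) * sum_{r>=1} x^(r-1/2) / (1*3*...*(2r-1))."""
    if dof < 1 or dof % 2 == 0:
        raise ValueError("this closed form needs an odd number of degrees of freedom")
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    density = math.exp(-0.5 * x) / math.sqrt(2.0 * math.pi)
    term, series = root, 0.0
    for r in range(1, (dof - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)
    return tail + 2.0 * density * series


chi2 = chi_square(pool_tails(observed, 5), pool_tails(expected, 5))
p_value = chi2_upper_tail_odd(chi2, 7)   # 7 degrees of freedom, as stated in the text
print(round(chi2, 3), round(p_value, 2))
```

The same function applied to the moment-fit value 6.756 gives a probability close to the quoted 0.46.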

























































































































Table 8. Graduation of 15 experiences by a Type VII distribution
(Column numbers correspond to data derived from 'Sources of Data' listed under the references)
[Table body printed sideways in the original; not recoverable from this copy.]







Altogether fifteen observed distributions, which are both symmetric and leptokurtic,
were fitted. They are reported in Table 8, and the sources from which they were drawn are
given at the end of the paper. In some cases ((4), (6), (7) and (11)) two or more adjacent cells
were combined where many groups were given in the original observations. Distribution
(15) was reported in unequal class intervals. It is reproduced without alteration, hence the
z-scale of Table 8 does not apply to it. For the estimation of the moments and frequency-moments
of (15) the coarser class intervals were split arbitrarily so as to make the calculations
at all possible. The χ²-test, however, was applied to the original grouping as shown in
column (15).

Distributions (1)-(5) were located by their mean statistics as their m's are all > 2.84. The
median statistics were used for locating distributions (6) to (11) as m < 2.84. Experiences
(12)-(15) were originally reported without distinction between positive and negative
deviations from the mean. It was assumed that their population means are zero. Consequently
one further degree of freedom was allowed in testing for goodness of fit.

For lack of space, frequencies in some cases ((6), (7) and (8)) falling into cells +10 and over
and −10 and under were lumped together. In the actual computations, however, the original
tail groups were used. The brackets indicate the tail groupings employed in the χ²-test for
both methods of fitting.

In cases where m < 3 it was found necessary to compute more than three ordinates per
group, at least in the centre of the distribution, in order to make the error of the quadrature
formula negligible.

It is well known that the χ²-test for goodness of fit is exacting whenever N is very large.
In such cases the probability P associated with an observed χ² will often be < 0.05, although
the actual fit seems to be quite good (Elderton, 1938, p. 204). For this reason it is better to
compare the actual χ²'s as derived from the two methods of fitting instead of their respective
probabilities.

A comparison of the χ² rows of Table 8 shows that the method of frequency moments
yields a better fit in fourteen out of fifteen Type VII distributions examined. The reduction
of total χ² is substantial in ten out of fifteen cases. As the various cells used were identical
for both methods, and as the amalgamation of tail groupings for the χ²-test was kept the
same, it is reasonable to assume that the almost all-round lowering of the χ²'s is due to a
reduction of errors of estimation. Such a result was expected previously on theoretical
grounds.

For distributions (6) and (7) Jeffreys (1939) has estimated the scale and shape parameters
of a Type VII law by his modified method of maximum likelihood. A comparison between his
method, the frequency-moment and the ordinary moment methods is given below:





























Distribution | Statistic     | Modified maximum likelihood | Frequency moments | Moments
(6)          | For shape, m  | 2.710                       | 2.797             | 3.272
(6)          | For scale, a  | 3.678                       | 3.591             | 4.226
(7)          | For shape, m  | 2.257                       | 2.260             | 3.300
(7)          | For scale, a  | 3.295                       | 3.266             | 4.823
















Normally, we should not employ the method of frequency moments for the estimation of
σ when dealing with a normal universe. There is, however, one exception first pointed out
by Yule (1938). We are sometimes confronted with an experience which appears to be
reasonably normal except for one or two outlying observations. If the material from which
the sample has been drawn is normal, we overestimate the moments due to the disproportionate
influence of the outlying observations. On the other hand, it is wrong to omit any
observation unless we have evidence of a real blunder in determining or recording an
observed quantity.

It is in such a situation that we may prefer the frequency-moment method to the more
efficient method of moments (for a normal population) as the former method weights the
tail ends far less than the centre of an experience whereas the opposite is true of the moment
method.

As an example Pearson's bright line no. 2 experiment (1902) may be quoted. For the
moment solution, we find

b_1 = 0.00178, b_2 = 5.02565,

which suggests a Type VII law. The fitted equation was found to be
$$y = 100.88\,(1 + 0.036624\,z^{2})^{-3.981},$$
giving the rather poor fit, P = 0.04.



























































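Table 9

Central | f_i | Type VII | Deviations | Normal | Deviations | Normal       | Deviations
value   |     | (Mom.)   |            | (Mom.) |            | (Freq. Mom.) |
 +21    |   1 |          |            |        |            |              |
 +19    |   0 |          |            |        |            |              |
 +17    |   0 |    0.4   |            |        |            |              |
 +15    |   0 |    0.2   |    -4.3    |        |    -2.3    |              |    -0.7
 +13    |   0 |    0.5   |            |   0.1  |            |      0.1     |
 +11    |   0 |    1.0   |            |   0.4  |            |      0.2     |
  +9    |   0 |    2.0   |            |   1.4  |            |      1.0     |
  +7    |   3 |    4.2   |            |   4.4  |            |      3.4     |
  +5    |   8 |    9.0   |    -1.0    |  11.1  |    -3.1    |      9.5     |    -1.5
  +3    |  31 |   19.0   |   +12.0    |  24.0  |    +7.0    |     22.2     |    +8.8
  +1    |  35 |   37.3   |    -2.3    |  43.5  |    -8.5    |     42.5     |    -7.5
  -1    |  73 |   64.6   |    +8.4    |  65.4  |    +7.6    |     66.8     |    +6.2
  -3    |  76 |   91.2   |   -15.2    |  82.8  |    -6.8    |     86.2     |   -10.2
  -5    |  96 |   99.1   |    -3.1    |  87.2  |    +8.8    |     91.5     |    +4.5
  -7    |  79 |   81.7   |    -2.7    |  77.1  |    +1.9    |     79.7     |    -0.7
  -9    |  60 |   52.9   |    +7.1    |  56.7  |    +3.3    |     57.0     |    +3.0
 -11    |  30 |   28.8   |    +1.2    |  35.0  |    -5.0    |     33.5     |    -3.5
 -13    |  17 |   14.2   |    +2.8    |  18.1  |    -1.1    |     16.2     |    +0.8
 -15    |   5 |    6.7   |            |   7.8  |            |      6.4     |
 -17    |   3 |    3.1   |            |   2.8  |            |      2.1     |
 -19    |   1 |    1.5   |            |   0.9  |            |      0.6     |
 -21    |   0 |    0.7   |    -2.9    |   0.2  |    -1.8    |      0.1     |    +0.8
 -23    |   0 |    0.4   |            |   0.1  |            |              |
 -25    |   0 |    0.2   |            |        |            |              |
 -27    |   1 |    0.3   |            |        |            |              |
  P     |     |          |    0.04    |        |    0.44    |              |    0.56

The deviations shown against +15 and −21 refer to the pooled tail groups +7 to +21 and −15 to −27 respectively.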























For the method of frequency-moments we have
α = 1.06185.

From Table 6 we see that m must be very far above 22. As the difference between a normal
curve and a Type VII curve of m > 30 is very minute we should proceed to fit a normal curve
by the method of frequency moments. This was done, giving
$$y = 92.57\,e^{-0.09994\,z^{2}},$$
with the good fit P = 0.56.
Had we known a priori that the sample came from a normal population we should have
fitted a normal curve by the method of moments, leading to the equation
$$y = 88.27\,e^{-0.09086\,z^{2}}, \qquad P = 0.44.$$

The original observations and the resulting three fits are given in Table 9.

Inspection of Table 9 shows that the frequency-moment fit is better than the ordinary 
moment solution as 

(1) P is larger; 

(2) absolute deviations are smaller in ten out of twelve cells; 

(3) there are nine changes of signs as compared with six in the moment fit. 
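
Criteria (2) and (3) can be verified mechanically from Table 9. The sketch below is illustrative only; it uses the twelve cells obtained after pooling the tails (central values +7 and above, and −15 and below), and the function names are hypothetical.

```python
# Twelve cells of Table 9 after pooling the tails, from the upper tail downwards.
observed = [4, 8, 31, 35, 73, 76, 96, 79, 60, 30, 17, 10]
normal_moments = [6.3, 11.1, 24.0, 43.5, 65.4, 82.8, 87.2, 77.1, 56.7, 35.0, 18.1, 11.8]
normal_freq_moments = [4.7, 9.5, 22.2, 42.5, 66.8, 86.2, 91.5, 79.7, 57.0, 33.5, 16.2, 9.2]


def deviations(obs, fitted):
    return [o - e for o, e in zip(obs, fitted)]


def sign_changes(devs):
    """Number of sign changes between consecutive deviations."""
    return sum(1 for x, y in zip(devs, devs[1:]) if x * y < 0)


dev_m = deviations(observed, normal_moments)
dev_fm = deviations(observed, normal_freq_moments)

smaller = sum(1 for x, y in zip(dev_fm, dev_m) if abs(x) < abs(y))
print(smaller, sign_changes(dev_fm), sign_changes(dev_m))
```

The counts printed are the number of cells in which the frequency-moment deviation is the smaller, followed by the sign changes for the frequency-moment and moment fits.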

The use of Hardy’s formula in the practical applications of the method of frequency- 
moments introduces certain inconsistencies with regard to the standard errors and efficiencies 
as derived in the theoretical part of the investigation. It is confidently felt, however, that the 
discrepancy between the practical and theoretical approach is small, just as, in practice, 
one works so often with Sheppard’s correction without taking it into account when estimating 
standard errors. 


It gives me great pleasure to express my thanks to Mr J. E. Kerrich for much helpful 
criticism, to the South African Council for Scientific and Industrial Research for permission 


to publish this paper, and to the staff of the Statistical Section for assisting me in heavy 
computational work. 


REFERENCES 


ELDERTON, W. P. (1938). Frequency Curves and Correlation. Cambridge University Press.

FISHER, R. A. (1921). On the mathematical foundations of theoretical statistics. Philos. Trans. A,
222, 309.

FISHER, R. A. (1948). Statistical Methods for Research Workers. Edinburgh: Oliver and Boyd.

HARDY, G. F. (1909). The Theory of the Construction of Tables of Mortality and of Similar Statistical
Tables in Use by the Actuary. London: Institute of Actuaries.

JEFFREYS, H. (1938). The law of errors and the combination of observations. Philos. Trans. A, 237,
231.

JEFFREYS, H. (1939). The law of errors in the Greenwich variation of latitude observations. Mon.
Not. R. Astr. Soc. 99, 703.

JEFFREYS, H. (1948). Theory of Probability. Oxford: Clarendon Press.

KENDALL, M. G. (1945). The Advanced Theory of Statistics, 1. London: Charles Griffin and Co. Ltd.

PEARSON, K. (1902). On the mathematical theory of errors. Philos. Trans. A, 198, 235.

SHEWHART, W. A. (1931). Economic Control of Quality of Manufactured Product. New York: van
Nostrand.

SICHEL, H. S. (1947). Fitting growth and frequency curves by the method of frequency moments.
J.R. Statist. Soc. 110, 337.

YULE, G. U. (1938). On some properties of normal distributions, univariate and bivariate, based on
sums of squares of frequencies. Biometrika, 30, 1.










SOURCES OF DATA 





(1) TERMAN, L. M. (1928). Intelligence quotients. From The Measurement of Intelligence. London:
Harrap and Co. Ltd.
(2) Nat. Inst. Pers. Res. Visual discrimination scores. Unpublished research data.
(3) SHEWHART, W. A. (1931). Random noises. From Economic Control of Quality of Manufactured
Product. New York: van Nostrand.
(4) MICHELSON, A. A. and others (1935). Light velocities. J. Astrophys. 32, 56.
(5) FLEMINGER, M. E. Output deviations. Unpublished research data.
(6, 7) HULME, H. R. & SYMMS, L. S. T. (1939). Variation of latitude. Mon. Not. R. Astr. Soc. 99, 642.
(8) HANSMANN, G. H. (1934). Distribution of third moment coefficients. Biometrika, 26, 129.
(9) SICHEL, H. S. (1947). Sampling experiment on Type VII population. J. Roy. Statist. Soc. 110,
337.
(10) KERRICH, J. E. Fortnightly gains of oxen. Unpublished research data.
(11) MILLS, F. C. (1938). Prices and commodities in 1927. From Statistical Methods. New York:
Holt and Co.
(12) BOND, W. N. (1935). Variation of judgement on the position of fuzzy object. From Probability
and Random Errors. London: Arnold.
(13) PEARSON, K. & MOUL, M. (1927). Distribution of tetrads. Biometrika, 19, 246.
(14) BESSEL, F. W. (1838). Errors of right ascensions. Astr. Nachr. 15, 358.
(15) PEARSON, E. S. (1929). Student's ratio for samples of two. Biometrika, 21, 259.










ON THE USE OF STUDENT'S t-TEST IN AN ASYMMETRICAL 
POPULATION 


By S. G. GHURYE


On account of the unique property of samples from a normal population that the ratio
$(\bar{x}-\mu)\sqrt{n+1}/s$ (where $\mu$ is the population mean, $\bar{x} = \sum_{i=1}^{n+1} x_i/(n+1)$ and $ns^2 = \sum (x_i-\bar{x})^2$)
is the ratio of a normal deviate to a stochastically independent estimate of its variance,
Student's t-test is a suitable test of significance for the mean of a normal population. However,
in a variety of cases, it is necessary to test for the mean of a population which does not follow
the Gaussian law. Efforts have, therefore, been made to see how far Student's distribution
may be used for the purpose in non-normal populations. Due, mainly, to the analytical
difficulties of the problem, no extensive theoretical discussion has yet been given. Thus,
Pearson & Adyanthaya (1929), Rietz (1939) and Nair (1941) have given experimental
treatments, while the theoretical discussions of some others (Rider, 1929; Perlo, 1933;
Laderman, 1939) have dealt only with trivially small sample sizes. The papers by Bartlett
(1935) and Geary (1936, 1947) give results true for any sample size, though they are based
on certain assumptions and approximations. The present paper deals with the population
considered by Geary in his 1936 paper, subject to the same approximations. The second
contribution by Geary (in which is derived the t-distribution in samples from a population
which departs more from normality than that considered in the 1936 paper) came to my
notice too late to be made use of in the present work; but it is proposed to consider it later on.

Geary (1936) has obtained the distribution of the ratio $(\bar{x}-\mu)\sqrt{n+1}/s$ in the case of an
asymmetrical population, whose fourth and higher cumulants are zero, by neglecting squares
and higher powers of the third cumulant. We know from this how far the probability of an
error of the first kind (i.e. the probability of rejecting the null hypothesis when it is true) in
such a population differs from that for a normal distribution, provided we may neglect the
square of the standardized third cumulant γ₁. Here again, on account of analytical difficulties,
it is not possible, except for very small sample sizes, to consider the effect of terms containing
higher powers of γ₁. However, we can assume the result derived by Geary to be correct for
very small values of γ₁, as also for large sample sizes; but in such cases the deviation from
values of the normal theory is practically negligible. Even then, it is of interest to know
whether, in using the usual tables of the t-test (based on the normal distribution), we are
committing the greater error in the probability of an error of the first kind or in that of an
error of the second kind. In the present paper are derived the values of the probability of an
error of the second kind (and hence, of the power of the test) when the usual t-tables are used
to define the critical region.

It may be mentioned here that this problem is only a special case of a general investigation,
on which the writer is engaged, into the effect, on statistical tests, of differences between the
actual and the assumed distribution laws of the universe sampled. The solution of these
problems is hampered by analytical difficulties in the derivation of the probability laws
(and particularly of power functions), and the present case is one of the few in which a
mathematical, though only approximate, solution has been found possible.










EXPRESSION FOR THE POWER FUNCTION 


Let the variate x have a mean μ, standard deviation σ, third cumulant κ₃ = γ₁σ³, and all
higher cumulants zero; γ₁ is assumed to be small, squares and higher powers being neglected.
The distribution function of x is given by
$$dF = \frac{1}{\sqrt{2\pi}}\left\{1+\frac{\gamma_1}{6}(\xi^3-3\xi)\right\}\exp(-\tfrac12\xi^2)\,d\xi,$$
where $\xi = (x-\mu)/\sigma$. Again neglecting γ₁² and higher powers, the probability of a sample of n + 1
independent values $x_1, x_2, \ldots, x_{n+1}$ is given by
$$(2\pi)^{-\frac12(n+1)}\left\{1+\frac{\gamma_1}{6}\left(\sum\xi^3-3\sum\xi\right)\right\}\exp\left(-\tfrac12\sum\xi^2\right)d\xi_1\ldots d\xi_{n+1}.$$

The part not containing γ₁ is the same as for the normal population. In what follows, we
are assuming that the table of the t-test based on the normal distribution is used for the
significance level, i.e. for the probability of an error of the first kind. The contribution of the
part not containing γ₁ is, therefore, the same as for a sample from a normal population, for
which the power of the t-test has been considered by Neyman et al. (1935) and again by
Neyman & Tokarska (1936). Hence, we shall consider below only the additive part
$$\frac{\gamma_1}{6(2\pi)^{\frac12(n+1)}}\left(\sum\xi^3-3\sum\xi\right)\exp\left(-\tfrac12\sum\xi^2\right)d\xi_1\ldots d\xi_{n+1}. \qquad (1)$$

The joint distribution of $\bar{x}$ and s can be obtained by the substitution used by Geary (1936),
and is found to be
$$dF(u,\chi) = \frac{\gamma_1}{2^{\frac12(n-2)}\,\Gamma(\tfrac{n}{2})\,6\sqrt{2\pi(n+1)}}\,\{u^3+3u\chi^2-3(n+1)u\}\,\chi^{n-1}\exp\{-\tfrac12(\chi^2+u^2)\}\,du\,d\chi, \qquad (2)$$
where
$$u = \frac{(\bar{x}-\mu)\sqrt{n+1}}{\sigma} \quad\text{and}\quad \chi^2 = \frac{ns^2}{\sigma^2} = \frac{\sum(x-\bar{x})^2}{\sigma^2}.$$

For testing the hypothesis μ = μ₀, the t-test is uniformly most powerful for the class of
one-sided alternatives μ > μ₀ or μ < μ₀. Thus, to test the hypothesis μ = μ₀ (or μ ≤ μ₀) against
the set of alternatives μ > μ₀, we find the value t of Student's ratio such that the probability
of exceeding it is a predetermined α (i.e. the value of t in Fisher's tables corresponding to the
probability 2α); we then reject the hypothesis μ ≤ μ₀ if the ratio $(\bar{x}-\mu_0)\sqrt{n+1}/s$ exceeds t,
and accept it if it does not. In both these decisions we may be wrong in that we may decide
that μ > μ₀ when, in fact, μ ≤ μ₀ (error of the first kind), or we may accept μ ≤ μ₀ even when
μ = μ₁ > μ₀ (error of the second kind). From Geary's paper we can calculate the difference
between the actual value of the probability of an error of the first kind and the value α
assumed. To obtain the corresponding correction for the probability of an error of the second
kind (and hence for the power of the test), we have to integrate the expression (2) over the
region of acceptance, i.e. over the domain conditioned by
$$\frac{(\bar{x}-\mu_0)\sqrt{n+1}}{s} \le t,$$
i.e. by $u \le t\chi/\sqrt{n} - \rho_n$, where $\rho_n = \dfrac{\mu_1-\mu_0}{\sigma}\sqrt{n+1}$ and t is a function of α.

Then the probability of an error of the second kind is
$$P_{II}(\alpha, n, \rho_n) = P^{0}_{II}(\alpha, n, \rho_n) - \gamma_1 P^{1}_{II}(\alpha, n, \rho_n); \qquad (3)$$










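where $P^{0}_{II}$ is the $P_{II}$ for the normal population and $-\gamma_1 P^{1}_{II}$ is the contribution of the terms
containing only the first power of γ₁. The power function $(1-P_{II})$ is then $(1-P^{0}_{II}+\gamma_1 P^{1}_{II})$,
so that $\gamma_1 P^{1}_{II}$ is the correction to be added to the 'normal-theory' value. Then
$$P^{1}_{II} = -\frac{1}{2^{\frac12(n-2)}\,\Gamma(\tfrac{n}{2})\,6\sqrt{2\pi(n+1)}}\int_0^\infty\!\!\int_{-\infty}^{(t\chi/\sqrt{n})-\rho_n}\{u^3+3u\chi^2-3(n+1)u\}\,\chi^{n-1}\exp\{-\tfrac12(\chi^2+u^2)\}\,du\,d\chi.$$
Now, the double integral
$$= \int_0^\infty \exp(-\tfrac12\chi^2)\,\chi^{n-1}\left[\int_{-\infty}^{(t\chi/\sqrt{n})-\rho_n}\{u^3+3u\chi^2-3(n+1)u\}\exp(-\tfrac12u^2)\,du\right]d\chi$$
$$= -\int_0^\infty \{2+u_0^2+3(\chi^2-n-1)\}\,\chi^{n-1}\exp\{-\tfrac12(\chi^2+u_0^2)\}\,d\chi, \quad\text{where } u_0 = \frac{t\chi}{\sqrt{n}}-\rho_n,$$
$$= -\frac{\exp(-\rho_n^2/2a^2)}{a^{n+2}}\int_{-b}^{\infty}(Aw^2+4bw+B)(w+b)^{n-1}\exp(-\tfrac12w^2)\,dw,$$
in which
$$a^2 = 1+\frac{t^2}{n}, \quad b = \frac{\rho_n t}{a\sqrt{n}}, \quad A = a^2+2 \quad\text{and}\quad B = (2-a^2)b^2+(\rho_n^2-3n-1)a^2.$$
Hence
$$P^{1}_{II} = \frac{\exp(-\rho_n^2/2a^2)}{2^{\frac12(n-2)}\,\Gamma(\tfrac{n}{2})\,6\sqrt{n+1}\;a^{n+2}}\,I_n, \qquad (4)$$
where
$$I_n = \frac{1}{\sqrt{2\pi}}\int_{-b}^{\infty}(Aw^2+4bw+B)(w+b)^{n-1}\exp(-\tfrac12w^2)\,dw. \qquad (5)$$

Thus, the problem of finding the values of $\gamma_1 P^{1}_{II}$, the contribution of the first power of
γ₁ to the power function, reduces to that of the evaluation of $I_n$.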


EVALUATION OF $P^{1}_{II}$

By expanding the binomial in (5), $I_n$ can be expressed as a linear function of n + 2 incomplete
Γ-functions. However, the best method of evaluating $I_n$, at least for the values of n considered
below, seems to be that based on a reduction formula.

Let
$$K_n = \frac{1}{\sqrt{2\pi}}\int_{-b}^{\infty}(w+b)^{n}\exp(-\tfrac12w^2)\,dw, \qquad (6)$$
so that
$$I_n = AK_{n+1} - 2a^2b\,K_n + (\rho_n^2-3n-1)\,a^2K_{n-1}. \qquad (7)$$
From (6), we get by partial integration
$$K_n = bK_{n-1} + (n-1)K_{n-2}. \qquad (8)$$
Denoting by f the normal probability-density $\frac{1}{\sqrt{2\pi}}\exp(-\tfrac12b^2)$ and by F the distribution
function $\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{b}\exp(-\tfrac12u^2)\,du$, we have $K_0 = F$ and $K_1 = bF+f$, from which successive
K's can be evaluated by using (8).*

Then, from (7) and (8),
$$I_n = lK_n + mK_{n-1}, \qquad (9)$$
where $l = b(2-a^2)$ and $m = 2n+(\rho_n^2-2n-1)a^2$.

* It is to be noted that $K_n = n!\,Hh_n(-b)/\sqrt{2\pi}$, where $Hh_n$ is the function tabulated (for n ≤ 21)
in Table XV of the British Association Mathematical Tables, Vol. I. I am obliged to Dr Harold
Hotelling for pointing this out to me.
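
The reduction scheme of equations (4), (8) and (9) lends itself to a short numerical sketch. The code below is illustrative and not the author's computing procedure: the function names are hypothetical, the critical value t must be supplied by the user (the one-sided 5 % points 2.132 for 4 and 1.833 for 9 degrees of freedom are used in the example), and the normalising constant follows the reading of (4) in which the denominator is 2^((n−2)/2) Γ(n/2) 6 √(n+1) a^(n+2).

```python
import math


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def k_sequence(b, n_max):
    """K_0 ... K_{n_max} from K_0 = F, K_1 = bF + f and recurrence (8)."""
    k = [normal_cdf(b), b * normal_cdf(b) + normal_pdf(b)]
    for n in range(2, n_max + 1):
        k.append(b * k[n - 1] + (n - 1) * k[n - 2])
    return k


def p1_second_kind(n, t, rho):
    """First-order coefficient P^1_II for n degrees of freedom (sample size n + 1),
    critical value t and standardised displacement rho; requires n >= 2."""
    a2 = 1.0 + t * t / n
    a = math.sqrt(a2)
    b = rho * t / (a * math.sqrt(n))
    k = k_sequence(b, n)
    l_coef = b * (2.0 - a2)
    m_coef = 2.0 * n + (rho * rho - 2.0 * n - 1.0) * a2
    i_n = l_coef * k[n] + m_coef * k[n - 1]            # equation (9)
    log_den = (0.5 * (n - 2) * math.log(2.0) + math.lgamma(0.5 * n)
               + math.log(6.0) + 0.5 * math.log(n + 1.0) + (n + 2) * math.log(a))
    return math.exp(-rho * rho / (2.0 * a2) - log_den) * i_n   # equation (4)


print(round(p1_second_kind(4, 2.132, 0.0), 4))
print(round(p1_second_kind(4, 2.132, 1.0), 4))
print(round(p1_second_kind(9, 1.833, 0.0), 4))
```

The three values can be compared with the entries of Table 1 for α = 0.05 (n = 4 with ρ_n = 0 and 1, and n = 9 with ρ_n = 0); agreement is expected to within a unit or so in the fourth decimal place.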










The values of $P^{1}_{II}(\alpha, n, \rho_n)$ were calculated, by way of illustration, for α = 0.05 (Table 1)
and α = 0.01 (Table 2), n = 4, 9 and 19 and all integral values of ρ_n for which the value of
$P^{1}_{II}$ exceeded 0.001 in absolute value. Geary (1936) has tabulated the values of his correction
to the distribution function for these values of n. The value of $-P^{1}_{II}$ for ρ_n = 0 must, of
course, agree with that in Geary's table for the same n and the appropriate t. The accuracy
of the results is conditioned by the accuracy of the value of t obtained from the probability
table, but the effect of subsequent errors due to rounding off has been rendered negligible
by retaining six or seven significant figures in the calculations. Under the circumstances, the
results given should be correct, except for the effect of rounding-off errors on the fourth place
of decimals.
Table 1. $P^{1}_{II}$ for α = 0.05

 n \ ρ_n |    0    |    1    |    2    |    3    |    4    |    5    |    6
    4    | -0.0343 | -0.0775 | -0.0361 | +0.0559 | +0.0649 | +0.0260 | +0.0051
    9    | -0.0297 | -0.0628 | -0.0046 | +0.0597 | +0.0340 | +0.0062 |    -
   19    | -0.0229 | -0.0450 | +0.0048 | +0.0445 | +0.0191 | +0.0025 |    -





Table 2. $P^{1}_{II}$ for α = 0.01

 n \ ρ_n |    0    |    1    |    2    |    3    |    4
    4    | -0.0105 | -0.0418 | -0.0771 | -0.0619 | +0.0066
    9    | -0.0115 | -0.0509 | -0.0753 | +0.0012 | +0.0731
   19    | -0.0098 | -0.0444 | -0.0523 | +0.0270 | +0.0569

 n \ ρ_n |    5    |    6    |    7    |    8    |    9
    4    | +0.0616 | +0.0646 | +0.0387 | +0.0156 | +0.0045
    9    | +0.0496 | +0.0130 | +0.0016 |    -    |    -
   19    | +0.0207 | +0.0026 |    -    |    -    |    -














EFFECT ON THE POWER FUNCTION 


Following Geary's 1936 paper, it has been assumed that the γ₁ of the population in question
is sufficiently small for our purpose; but it has not been found possible to define the range of
permissible values of γ₁. It is proposed to consider this question, as also that of improving
the approximation to the power function, in a further publication.

In order to get an idea of the magnitude of the effect of departure from normality, the
values of the power for γ₁ = 0 (normal distribution) and γ₁ = 0.4, n = 9 and 19 and some
values of ρ_n, are given in Table 3, assuming that the effect of higher powers of γ₁ is negligible.
In this respect, it may be mentioned here that on the wider assumption used in Geary's
1947 paper the actual probability of t lying to the left of the lower 2.5 % point of normal
theory is 0.041 in a sample of 10 from a Pearsonian population with γ₁ = 0.5 and γ₂ = 0,
whereas, according to the assumption on which the present paper is based, the same
probability is 0.035. It has, therefore, been assumed for the present that the approximation
used in the present paper also gives a fair estimate of the power function when γ₁ = 0.4.


Table 3. Comparison of the power when γ₁ = 0.0 and γ₁ = 0.4

                α = 0.05                 |     |                α = 0.01
       n = 9       |       n = 19        |     |       n = 9       |       n = 19
 γ₁=0.0 | γ₁=0.4   |  γ₁=0.0 | γ₁=0.4    | ρ_n | γ₁=0.0 | γ₁=0.4   |  γ₁=0.0 | γ₁=0.4
 0.050  | 0.038    |  0.050  | 0.041     |  0  | 0.010  | 0.005    |  0.010  | 0.006
 0.236  | 0.211    |  0.248  | 0.230     |  1  | 0.071  | 0.051    |  0.081  | 0.063
 0.580  | 0.578    |  0.612  | 0.614     |  2  | 0.268  | 0.238    |  0.320  | 0.299
 0.868  | 0.892    |  0.894  | 0.912     |  3  | 0.587  | 0.587    |  0.677  | 0.688
 0.979  | 0.993    |  0.986  | 0.994     |  4  | 0.853  | 0.882    |  0.916  | 0.939
        |          |         |           |  5  | 0.969  | 0.989    |  0.989  | 0.997
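
The γ₁ = 0.4 columns of Table 3 follow from the normal-theory power and the coefficients of Tables 1 and 2 by adding γ₁ times the tabulated coefficient. A minimal sketch for the block n = 9, α = 0.05 is given below; the dictionary and function names are hypothetical, and the numbers are copied from Tables 1 and 3.

```python
# Normal-theory power (gamma_1 = 0) for n = 9, alpha = 0.05, from Table 3.
normal_power = {0: 0.050, 1: 0.236, 2: 0.580, 3: 0.868, 4: 0.979}
# First-order coefficients P^1_II for n = 9, alpha = 0.05, from Table 1.
p1_coefficient = {0: -0.0297, 1: -0.0628, 2: -0.0046, 3: 0.0597, 4: 0.0340}


def corrected_power(power0, p1, gamma1):
    """Power to first order in gamma_1: normal-theory value plus gamma_1 * P^1_II."""
    return power0 + gamma1 * p1


corrected = {rho: round(corrected_power(normal_power[rho], p1_coefficient[rho], 0.4), 3)
             for rho in normal_power}
print(corrected)
```

The rounded results can be set beside the γ₁ = 0.4 column for n = 9, α = 0.05.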



































The effect of positive skewness is to decrease the power in the region of less power and 
increase it everywhere else, the opposite being true of negative skewness. It is seen from 
Table 3 that the change in the power is not such as to affect materially the inference drawn 
from the test. How far this is so for greater departures from normality, it is not possible to 
determine by the present method. 

CONCLUSION
It has thus been found possible to obtain an expression, though approximate, for the power 
of Student’s t-test applied to samples from an asymmetrical universe when the critical region 
has been determined on the erroneous assumption of normality in the parent population. 

The results derived are true if the fourth and higher cumulants are zero and the stan- 
dardized third cumulant sufficiently small; subject to these restrictions, the change in the 
power of the test is found to be negligible as far as the inference to be drawn is concerned. 


My thanks are due to Dr N. R. Tawde of the Royal Institute of Science, Bombay, and
Prof. G. S. Priolker of the Wilson College, Bombay, for allowing me to use their calculating
machines.


REFERENCES 


BARTLETT, M. S. (1935). Proc. Camb. Phil. Soc. 31, 223.

GEARY, R. C. (1936). J.R. Statist. Soc. Suppl. 3, 178.

GEARY, R. C. (1947). Biometrika, 34, 209.

LADERMAN, J. (1939). Ann. Math. Statist. 10, 376.

NAIR, A. N. K. (1941). Sankhya, 5, 393.

NEYMAN, J. et al. (1935). J.R. Statist. Soc. Suppl. 2, 107.

NEYMAN, J. & TOKARSKA, B. (1936). J. Amer. Statist. Ass. 31, 318.

PEARSON, E. S. & ADYANTHAYA, N. K. (1929). Biometrika, 21, 259.

PERLO, V. (1933). Biometrika, 25, 203.

RIDER, P. R. (1929). Biometrika, 21, 124.

RIDER, P. R. (1931). Ann. Math. Statist. 2, 48.

RIETZ, H. L. (1939). Ann. Math. Statist. 10, 265.
















TABLES OF SYMMETRIC FUNCTIONS—PART I 
By F. N. DAVID AND M. G. KENDALL


1. Symmetric functions have important applications in the theory of probability and
the theory of sampling. There are four types of symmetric function in use, and we shall also
employ a fifth. They are as follows:

(a) The monomial symmetric functions typified by
$$(p_1^{\pi_1}p_2^{\pi_2}\ldots p_s^{\pi_s}) = \sum x_i^{p_1}x_j^{p_1}\ldots x_q^{p_2}x_r^{p_2}\ldots x_u^{p_s}x_v^{p_s}, \qquad (1)$$
where the summation takes place over all suffixes i, j, ..., q, r, ..., u, v which are different. The
number $\sum_{i=1}^{s}\pi_i = \rho$, say, is called the order of the symmetric function and $\sum_{j=1}^{s}p_j\pi_j = w$, say,
is called its weight. There are $\binom{n}{\rho}$ terms in the summation on the right in (1), where n is the
number of possible different suffixes.

(b) The unitary symmetric functions $(1^r)$, denoted by $a_r$. We have, from equation (1),
$$a_r = (1^r) = \sum x_ix_j\ldots x_v, \qquad (2)$$
where the subscripts i, j, ..., v are all different and there are r of them in any one term of the
sum. $a_r$ may also be defined by the identity in t
$$T = (t-x_1)(t-x_2)\ldots(t-x_n) = t^n - a_1t^{n-1} + a_2t^{n-2} - \ldots; \qquad (3)$$

(c) The one-part symmetric functions or power-sums, defined by
$$s_r = (r) = \sum x_i^r. \qquad (4)$$
These also are a special case of (1).

(d) The homogeneous product-sums defined by
$$\frac{1}{1-a_1t+a_2t^2-\ldots} = 1 + h_1t + \ldots + h_rt^r + \ldots;$$
we shall not use these functions. They are mentioned for the sake of completeness.

(e) The "augmented" functions
$$[p_1^{\pi_1}\ldots p_s^{\pi_s}] = (p_1^{\pi_1}\ldots p_s^{\pi_s})\,\pi_1!\ldots\pi_s!. \qquad (5)$$
For many statistical purposes these are more convenient than the ordinary monomials.


2. The tables below show the product of power-sums of weight w in terms of the augmented
symmetrics and vice versa, up to and including weight 12. Each table is given the number 1
(because further sets of tables are contemplated) followed by its weight, e.g. Table 1.5
relates to weight 5. To express the power-sums in terms of augmented symmetrics the
tables are read horizontally up to and including the diagonal, the unit entry in which is
shown in heavy type. For example, from Table 1.5
$$(2)^2(1) = [5] + [41] + 2[32] + [2^21]. \qquad (6)$$
The augmented symmetrics are given in terms of power-sums by reading the tables vertically
from the top, again up to and including the diagonal. For example, from Table 1.5
$$[31^2] = 2(5) - 2(4)(1) - (3)(2) + (3)(1)^2. \qquad (7)$$
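
Relations (6) and (7) can be checked by brute force on any small set of numbers. In the sketch below (an illustration, with hypothetical function names) an augmented symmetric function is evaluated as a sum over all ordered selections of distinct suffixes, which is equivalent to multiplying the monomial function by π₁! ... π_s! as in (5).

```python
from itertools import permutations


def power_sum(xs, r):
    """One-part symmetric function (r) = sum of x^r."""
    return sum(x ** r for x in xs)


def augmented(xs, parts):
    """Augmented symmetric function [parts]: sum over ordered tuples of
    distinct suffixes of the product of x_i ** p."""
    total = 0
    for idx in permutations(range(len(xs)), len(parts)):
        term = 1
        for i, p in zip(idx, parts):
            term *= xs[i] ** p
        total += term
    return total


xs = [1, 2, 3, 5, 7]                      # arbitrary integer test values
s = {r: power_sum(xs, r) for r in range(1, 6)}

lhs6 = s[2] ** 2 * s[1]
rhs6 = (augmented(xs, (5,)) + augmented(xs, (4, 1))
        + 2 * augmented(xs, (3, 2)) + augmented(xs, (2, 2, 1)))

lhs7 = augmented(xs, (3, 1, 1))
rhs7 = 2 * s[5] - 2 * s[4] * s[1] - s[3] * s[2] + s[3] * s[1] ** 2

print(lhs6 == rhs6, lhs7 == rhs7)
```

Integer inputs keep the comparison exact; any other small set of values would serve equally well.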




USES OF THE TABLES 


3. A minor use of the tables may be noted: the expression of $(1)^w$ in terms of the augmented
symmetrics gives the expansion of moments about an arbitrary point in terms of cumulants;
e.g. from the bottom line of Table 1.5 we have
$$\mu'_5 = \kappa_5 + 5\kappa_4\kappa_1 + 10\kappa_3\kappa_2 + 10\kappa_3\kappa_1^2 + 15\kappa_2^2\kappa_1 + 10\kappa_2\kappa_1^3 + \kappa_1^5. \qquad (8)$$
If, of course, we can accept moments about the mean we ignore any term containing a unit
part; e.g.
$$\mu_5 = \kappa_5 + 10\kappa_3\kappa_2. \qquad (9)$$

4. In statistical investigations concerning sampling moments we often need the expectations
of products of power-sums. These can be written down at sight, for samples from an
infinite population, from Table 1 when it is remembered that
$$E[p_1^{\pi_1}\ldots p_s^{\pi_s}] = n^{(\rho)}(\mu'_{p_1})^{\pi_1}(\mu'_{p_2})^{\pi_2}\ldots(\mu'_{p_s})^{\pi_s}, \qquad (10)$$
where $n^{(\rho)} = n(n-1)\ldots(n-\rho+1)$.
Thus, from (6),
$$E\{(2)^2(1)\} = n\mu'_5 + n(n-1)\mu'_4\mu'_1 + 2n(n-1)\mu'_3\mu'_2 + n(n-1)(n-2)\mu'^{\,2}_2\mu'_1. \qquad (11)$$
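
Equation (11) can be confirmed exactly for a small discrete population by enumerating every possible sample. The sketch below is an illustration only: the three-point population (values and probabilities) is an arbitrary assumption chosen to keep the enumeration short, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical three-point parent population.
VALUES = [0, 1, 3]
PROBS = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]


def raw_moment(r):
    """Parent moment about the origin of order r."""
    return sum(p * v ** r for v, p in zip(VALUES, PROBS))


def expected_by_enumeration(n):
    """Exact E{(2)^2 (1)} over all ordered samples of size n."""
    total = Fraction(0)
    for sample in product(range(len(VALUES)), repeat=n):
        weight = Fraction(1)
        for k in sample:
            weight *= PROBS[k]
        xs = [VALUES[k] for k in sample]
        s1 = sum(xs)
        s2 = sum(x * x for x in xs)
        total += weight * s2 * s2 * s1
    return total


def expected_by_formula(n):
    """Right-hand side of equation (11)."""
    m1, m2, m3, m4, m5 = (raw_moment(r) for r in range(1, 6))
    return (n * m5 + n * (n - 1) * m4 * m1 + 2 * n * (n - 1) * m3 * m2
            + n * (n - 1) * (n - 2) * m2 ** 2 * m1)


print(all(expected_by_enumeration(n) == expected_by_formula(n) for n in range(1, 5)))
```

Exact rational arithmetic is used so that the two sides can be compared without rounding error.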


5. The same kind of procedure gives the multivariate cumulants in terms of multivariate
moments about an arbitrary point. For example, corresponding to (6) we have
$$\mu'(2^21) = \kappa(2^21) + \kappa(2^2)\kappa(1) + 2\kappa(21)\kappa(2) + \{\kappa(2)\}^2\kappa(1). \qquad (12)$$
Conversely, from (7),
$$\kappa(31^2) = 2\mu'(3)\{\mu'(1)\}^2 - 2\mu'(31)\mu'(1) - \mu'(3)\mu'(1^2) + \mu'(31^2). \qquad (13)$$
Equations of type (13) are particularly useful.


6. Table 1 may also be used to express k-statistics in terms of the one-part symmetrics 
(r), namely, in terms of the power-sums which are used to calculate them in practice. The rth 
k-statistic is the symmetric function of the sample values whose expectation is equal to the 
parent cumulant of order r. Thus, corresponding to the inverse of (8), we have 





$$k_5 = \frac{(5)}{n} - \frac{5[41]}{n(n-1)} - \frac{10[32]}{n(n-1)} + \frac{20[31^2]}{n(n-1)(n-2)} + \frac{30[2^2 1]}{n(n-1)(n-2)} - \frac{60[21^3]}{n(n-1)(n-2)(n-3)} + \frac{24[1^5]}{n(n-1)(n-2)(n-3)(n-4)}. \tag{14}$$

Substituting for the augmented symmetrics by power-sums written as $s_r$, from Table 1.5,
and collecting terms, we get

$$k_5 = \frac{1}{n^{(5)}} \{(n^4 + 5n^3) s_5 - 5(n^3 + 5n^2) s_4 s_1 - 10(n^3 - n^2) s_3 s_2 + 20(n^2 + 2n) s_3 s_1^2 + 30(n^2 - n) s_2^2 s_1 - 60n s_2 s_1^3 + 24 s_1^5\}. \tag{15}$$
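Formula (15) lends itself to direct computation. The following Python sketch is an added illustration (the function name `k5` is an assumption); it evaluates the fifth k-statistic from the power-sums in exact rational arithmetic and exhibits two properties that any estimate of a fifth cumulant should have: it is unchanged by a shift of origin and changes sign when the observations are reflected.

```python
from fractions import Fraction


def k5(sample):
    """Fifth k-statistic computed from the power-sums s_1, ..., s_5 as in (15)."""
    n = len(sample)
    if n < 5:
        raise ValueError("k5 requires at least five observations")
    s = {r: sum(Fraction(x) ** r for x in sample) for r in range(1, 6)}
    falling = n * (n - 1) * (n - 2) * (n - 3) * (n - 4)   # n^(5)
    numerator = ((n ** 4 + 5 * n ** 3) * s[5]
                 - 5 * (n ** 3 + 5 * n ** 2) * s[4] * s[1]
                 - 10 * (n ** 3 - n ** 2) * s[3] * s[2]
                 + 20 * (n ** 2 + 2 * n) * s[3] * s[1] ** 2
                 + 30 * (n ** 2 - n) * s[2] ** 2 * s[1]
                 - 60 * n * s[2] * s[1] ** 3
                 + 24 * s[1] ** 5)
    return numerator / falling


xs = [1, 2, 4, 7, 11, 3]                      # arbitrary illustrative sample
print(k5(xs) == k5([x + 10 for x in xs]))     # invariance under change of origin
print(k5([-x for x in xs]) == -k5(xs))        # odd order: sign reverses
```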
7. The tables provide a method of evaluating the sampling cumulants of k-statistics
(or of any symmetric functions of the observations) by the direct evaluation of expectations.
The alternative combinatorial method (Fisher, 1929; Kendall, 1941) is shorter in performance
but needs careful handling in view of the ease with which certain combinations can be
overlooked. Kendall (1941) lists most of the cumulants of k-statistics up to order 12 but does not
give $\kappa(3^3 2)$. We will sketch the derivation of this quantity as an illustration of the use of the
tables.










By relations derived as in §6 we have 


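$$k_3^3 k_2 = \left\{\frac{n^2 s_3 - 3n s_2 s_1 + 2 s_1^3}{n(n-1)(n-2)}\right\}^3 \frac{n s_2 - s_1^2}{n(n-1)}$$

$$\begin{aligned}
= \frac{1}{n^4 (n-1)^4 (n-2)^3} \{ & n^7 s_3^3 s_2 - n^6 (s_3^3 s_1^2 + 9 s_3^2 s_2^2 s_1) \\
& + n^5 (15 s_3^2 s_2 s_1^3 + 27 s_3 s_2^3 s_1^2) - n^4 (6 s_3^2 s_1^5 + 63 s_3 s_2^2 s_1^4 + 27 s_2^4 s_1^3) \\
& + n^3 (48 s_3 s_2 s_1^6 + 81 s_2^3 s_1^5) - n^2 (12 s_3 s_1^8 + 90 s_2^2 s_1^7) + n (44 s_2 s_1^9) - 8 s_1^{11} \}.
\end{aligned} \tag{16}$$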
We take expectations of both sides. The terms on the right may be written down in terms of
parent $\mu'$'s after the manner of § 4. These $\mu'$'s are then transformed to parent $\kappa$'s by the known
relations connecting them, for example, those of § 3. The result gives us $\mu'(3^3 2)$ and the
corresponding $\kappa$ is then derived from the equation


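$$\kappa(3^3 2) = \mu'(3^3 2) - 3\kappa_3\kappa(3^2 2) - \kappa_2\kappa(3^3) - 3\kappa(3^2)\kappa(32) - 3\kappa(32)\kappa_3^2 - 3\kappa_2\kappa_3\kappa(3^2) - \kappa_3^3\kappa_2, \tag{17}$$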


which is itself derived from the tables as described in § 5. The $\kappa$'s of lower order occurring in
this expression have already been tabulated, and we find, after carrying out the necessary
substitutions and reductions








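$$\begin{aligned}
\kappa(3^3 2) = {} & \frac{\kappa_{11}}{n^3} + \frac{45}{n^2(n-1)}\kappa_9\kappa_2 + \frac{9(17n-26)}{n^2(n-1)^2}\kappa_8\kappa_3 \\
& + \frac{27(11n^2-31n+22)}{n^2(n-1)^3}\kappa_7\kappa_4 + \frac{9(49n^2-134n+103)}{n^2(n-1)^3}\kappa_6\kappa_5 \\
& + \frac{54(12n-23)}{n(n-1)^2(n-2)}\kappa_7\kappa_2^2 + \frac{54(63n^2-220n+178)}{n(n-1)^3(n-2)}\kappa_6\kappa_3\kappa_2 \\
& + \frac{54(93n^2-340n+316)}{n(n-1)^3(n-2)}\kappa_5\kappa_4\kappa_2 + \frac{54(71n^3-421n^2+842n-564)}{n(n-1)^3(n-2)^2}\kappa_5\kappa_3^2 \\
& + \frac{108(41n^3-257n^2+543n-390)}{n(n-1)^3(n-2)^2}\kappa_4^2\kappa_3 + \frac{108(33n^2-122n+110)}{(n-1)^3(n-2)^2}\kappa_5\kappa_2^3 \\
& + \frac{648(29n^2-126n+137)}{(n-1)^3(n-2)^2}\kappa_4\kappa_3\kappa_2^2 + \frac{324(29n^2-136n+164)}{(n-1)^3(n-2)^2}\kappa_3^3\kappa_2 \\
& + \frac{1296n(5n-12)}{(n-1)^3(n-2)^2}\kappa_3\kappa_2^4.
\end{aligned} \tag{18}$$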








METHOD OF CONSTRUCTION OF THE TABLES 


8. That part of the tables which expresses the products of power-sums in terms of the
augmented symmetrics (the part below the main diagonal of the tables) was constructed by
building up for a given weight from the lower weights. For instance, the expression of
(3)(2)(1) in Table 1.6 may be derived in three ways: by multiplying (3)(2) by (1); by
multiplying (3)(1) by (2); and by multiplying (2)(1) by (3). From Table 1.3 we have


(2) (1) = [3] + [21]. 


Hence
$$(3)(2)(1) = (3)[3] + (3)[21] = [6] + [3^2] + [51] + [42] + [321]. \tag{19}$$

Again, from Table 1.4, $(3)(1) = [4] + [31]$.

Hence
$$(2)(3)(1) = (2)[4] + (2)[31] = [6] + [42] + [51] + [3^2] + [321],$$
agreeing with (19).
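The building-up process of § 8 rests on one rule: multiplying an augmented symmetric by the power-sum (r) either adds r to one of the existing parts or appends r as a new part. A minimal Python sketch of that rule follows; it is an added illustration, the function names are assumptions, and partitions are represented as tuples of parts in descending order.

```python
from collections import Counter


def canonical(parts):
    """Write a partition with its parts in descending order."""
    return tuple(sorted(parts, reverse=True))


def times_power_sum(r, expansion):
    """Multiply a linear combination of augmented symmetrics by (r).

    `expansion` maps a partition (tuple of parts) to its coefficient.
    """
    out = Counter()
    for parts, coeff in expansion.items():
        # The new index coincides with an existing one: r is added to that part.
        for i in range(len(parts)):
            merged = parts[:i] + (parts[i] + r,) + parts[i + 1:]
            out[canonical(merged)] += coeff
        # The new index differs from all existing ones: r becomes a new part.
        out[canonical(parts + (r,))] += coeff
    return dict(out)


def product_of_power_sums(orders):
    """Expand (a)(b)(c)... in augmented symmetrics, one factor at a time."""
    expansion = {(orders[0],): 1}
    for r in orders[1:]:
        expansion = times_power_sum(r, expansion)
    return expansion


# The three routes to (3)(2)(1) described in the text.
print(product_of_power_sums([2, 1, 3]))
print(product_of_power_sums([3, 1, 2]))
print(product_of_power_sums([3, 2, 1]))
```

Applied to five factors (1), the same function gives the bottom line of Table 1.5.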


9. Each line was checked in this manner by being calculated in two different ways by one
of us (M.G.K.) and was independently checked by the other (F.N.D.) by symbolic operation.
Any product of the one-part functions may be expressed as a sum of monomial symmetrics;
for example,

$$(3)(2)(1) = A(6) + B(51) + C(42) + E(3^2) + F(41^2) + G(321) + H(2^3) + I(31^3) + J(2^2 1^2) + K(21^4) + L(1^6), \tag{20}$$

where A, B, etc., are positive constants to be determined and we do not use D because we
require it for a different purpose. In fact, using MacMahon's D-operator technique we have,
for example,

$$D_6 (3)(2)(1) = 1 = A,$$

all other terms vanishing. Similar operations with $D_5 D_1$, $D_4 D_2$, $D_3^2$, $D_3 D_2 D_1$, etc., give
$$A = B = C = G = 1, \quad E = 2,$$
and all other terms zero. Conversion into the [ ] functions then gives equation (19).

This was the method of check actually used. It may be noted that any table of given weight
can be built up more speedily from those of lower weight by a similar process, e.g. if we operate
on equation (20) by $D_1$ we get

$$(3)(2) = B(5) + F(41) + G(32) + I(31^2) + J(2^2 1) + K(21^3) + L(1^5),$$

and the values for the coefficients may be read off from the table for weight 5. Operation
by $D_2$ gives
$$(3)(1) = C(4) + G(31) + H(2^2) + J(21^2) + K(1^4),$$

and the values of those coefficients not given by weight 5 are found from the table for
weight 4. This method was not used as a check because it only repeats the process described
in §8 and was therefore considered to be insufficient.


10. The complementary part of the tables, expressing the augmented symmetrics in terms
of power-sums, was constructed by inverting the relations below the main diagonal. Consider,
for instance, Table 1.8. Starting in the cell with row (7)(1) and column [71] we complete the
column [71]. We then proceed to the cell in row (6)(2), column [62], and working upwards,
complete the column [62]; and so on from column to column. A check is provided by the fact
that the number in the top row (8) must be $(-1)^{\rho-1}(\rho-1)!$, where $\rho$ is the number of parts in
the item at the head of the column. There are various other relations between the numbers
which act as a check on any doubtful figure.
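The columns above the diagonal can also be generated independently of the column-by-column inversion just described. The Python sketch below is an added illustration and uses a different technique, namely Möbius inversion over the set partitions of the parts: each grouping of the parts into blocks contributes the product of the power-sums of the block totals, weighted by the product of $(-1)^{b-1}(b-1)!$ over blocks of size $b$. The function names are assumptions.

```python
from collections import Counter
from math import factorial


def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` in each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or in a new block of its own.
        yield [[first]] + partition


def augmented_to_power_sums(parts):
    """Express [p1 p2 ...] in products of power-sums.

    Returns a dict mapping the orders of the power-sums in a product
    (descending tuple) to the coefficient of that product.
    """
    out = Counter()
    for partition in set_partitions(list(parts)):
        weight = 1
        for block in partition:
            weight *= (-1) ** (len(block) - 1) * factorial(len(block) - 1)
        key = tuple(sorted((sum(block) for block in partition), reverse=True))
        out[key] += weight
    return {k: v for k, v in out.items() if v != 0}


column = augmented_to_power_sums((3, 2, 1, 1, 1))   # column [321^3] of Table 1.8
rho = 5                                              # number of parts
print(column[(8,)] == (-1) ** (rho - 1) * factorial(rho - 1))
```

The final line applies the check quoted in the text to the entry in the top row (8).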







11. Again, an independent check was applied. For example, analogously to equation (20)
we have

$$(321) = A'(6) + B'(5)(1) + C'(4)(2) + E'(3)^2 + F'(4)(1)^2 + G'(3)(2)(1) + H'(2)^3 + I'(3)(1)^3 + J'(2)^2(1)^2 + K'(2)(1)^4 + L'(1)^6. \tag{22}$$
By carrying out a series of operations, each time getting zero on the left-hand side, we obtain
a set of equations which may be solved for the constant coefficients. This may sound
formidable, but in actual practice so many of the coefficients are zero and so many short cuts may
be devised that the procedure was not found unmanageable.


12. The present tables are complete for weights of 12 or lower. Complete tables for higher 
weights would involve a lot of additional labour and more printing space than their usefulness 
would justify. It may be noted, however, that expressions for higher weights can be obtained 
from those for lower weights, when necessary, by the symbolic method exemplified in §9. 
A similar method can be used for the converse relations. 


13. We should like to congratulate the Press on the way in which the tables are 
arranged and on the uniform excellence of their type-setting. 


REFERENCES 
FISHER, R. A. (1929). Moments and product-moments of sampling distributions. Proc. Lond. Math.
Soc. (2), 30, 199.
KENDALL, M. G. (1941). The Advanced Theory of Statistics, 1. 4th edition, 1948. Charles Griffin and Co.
MACMAHON, P. A. (1915). Combinatory Analysis, 1. Cambridge University Press.
















































































Table 1.2

w = 2       [2]  [1^2]
(2)          1    -1
(1)^2        1     1

Table 1.3

w = 3       [3]  [21]  [1^3]
(3)          1    -1     2
(2)(1)       1     1    -3
(1)^3        1     3     1

Table 1.4

w = 4       [4]  [31]  [2^2]  [21^2]  [1^4]
(4)          1    -1    -1      2      -6
(3)(1)       1     1     .     -2       8
(2)^2        1     .     1     -1       3
(2)(1)^2     1     2     1      1      -6
(1)^4        1     4     3      6       1

Table 1.5

w = 5       [5]  [41]  [32]  [31^2]  [2^2 1]  [21^3]  [1^5]
(5)          1    -1    -1     2       2       -6      24
(4)(1)       1     1     .    -2      -1        6     -30
(3)(2)       1     .     1    -1      -2        5     -20
(3)(1)^2     1     2     1     1       .       -3      20
(2)^2(1)     1     1     2     .       1       -3      15
(2)(1)^3     1     3     4     3       3        1     -10
(1)^5        1     5    10    10      15       10       1
Table 1.6

w = 6         [6]  [51]  [42]  [41^2]  [3^2]  [321]  [31^3]  [2^3]  [2^2 1^2]  [21^4]  [1^6]
(6)            1    -1    -1     2      -1      2     -6       2      -6        24    -120
(5)(1)         1     1     .    -2       .     -1      6       .       4       -24     144
(4)(2)         1     .     1    -1       .     -1      3      -3       5       -18      90
(4)(1)^2       1     2     1     1       .      .     -3       .      -1        12     -90
(3)^2          1     .     .     .       1     -1      2       .       2        -8      40
(3)(2)(1)      1     1     1     .       1      1     -3       .      -4        20    -120
(3)(1)^3       1     3     3     3       1      3      1       .       .        -4      40
(2)^3          1     .     3     .       .      .      .       1      -1         3     -15
(2)^2(1)^2     1     2     3     1       2      4      .       1       1        -6      45
(2)(1)^4       1     4     7     6       4     16      4       3       6         1     -15
(1)^6          1     6    15    15      10     60     20      15      45        15       1












































































































































































Table 1.7 
w=7 {7} | (61) | {s2}) | fsx") | f43) | (42x) | (41°) | O3%r) | C32") | (322%) (314) | (28x) | (2%) | far’) [17] 
(7) £ -1I -1 2 -1 2 -6 2 2 -6 24 -6 —120 720 
(5) G1) 1 I 2 -1 6 -1I e 4 —24 2 -1 120 —840 
(5) (2) 1 . -1 ° -1 3 ° —2 3 —12 —18 84 — 504 
(5) (1)? I 2 1 Z ° e -3 ° -1 12 ° 6 — 60 504 
(4) (3) I ° ° ° Z —% 2 —2 -1! 4 —14 3 —14 7° — 420 
(4) (2) (1) I 1 1 I Z -3 ‘ -2 12 -3 1s —go 630 
(4) Gy t 3 3 3 I 3 Z < - —! 20 —210 
(3)? (1) I I e 2 ° ° Z —2 rf 6 —40 280 
(3) (2) 1 ° 2 I ° -1 3 -3 —35 210 
(3) (2) (1)* t 2 2 I 3 2 2 I Zz -6 —6 50 —420 
(3) G1)* rf 4 6 6 5 12 4 4 3 6 Z -5 7° 
(2)° (1) I I 3 : 3 3 . 3 : z 2 15 — 105 
(2)? (1)* t 3 5 3 7 9 I 6 7 6 3 I —10 105 
(2) (1)* 1 5 I 10 15 35 10 20 25 5 1s 10 I —21 
qy if 7 21 21 35 105 35 7o 105 210 35 105 105 21 Z 
Table 1.8 

w=8(i) | [8] (71) (62) | [6r*) | [s3] | {s2x] | (51°) [4] | (431) | ([42*) | C420") | [ar] 

(8) Z —1 —3 2 —t 2 -6 —% 2 -6 24 

(7) G@) 1 I e 2 —1I 6 -1 4 —24 

(6) (2) 1 > I —1 -1I 3 -2 3 —12 

(6) (1)* 1 2 1 I ° -3 -1I 12 

(5) (3) I i ‘ r -1 2 -1 2 -8 

(5) (2) (1) 1 ' 1 é I I -3 -2 12 

(5) (1)* I 3 3 3 I 3 I . —4 

(4)* I ‘ I - -I 2 —6 

(4) (3).G@) 1 I i 1 : ‘ -2 8 

(4) (2)* I 2 I I at | 3 

(4) {3} (1)* I 2 2 I 2 2 . I 2 1 Z -6 

(4) Giy* 1 4 6 6 4 12 4 I 4 3 6 a 

(3)? (2) 1 ‘ I a 2 . ; . i. . . 

3" Gy) I 2 I 1 2 ° 2 4 4 ° 

(3) (2)? (1) 1 1 2 ‘ 3 2 I I 1 . 

(3) (2) Gy I 3 4 3 5 6 1 3 9 3 3 

(3) Gi)* 1 5 10 10 11 30 10 5 25 15 30 5 

(2)* I ‘ 4 i" 3 a 6 ° 

(2)? (1)* I 2 4 I 6 6 3 6 6 3 

£2)" (1)* 1 4 ‘ 6 12 20 4 7 28 16 1 1 

(2) (1)* 1 6 16 15 26 66 20 15 60 105 15 

(1)* 1 8 28 28 56 168 56 35 280 210 420 Jo 

tw ~ 8 (ii) (3*2) | (3*1*] | (32*1) | [321°] [31°] [2*] [2*1*} [2*1‘] {21} {1°} 

rs <a ee 

(8) 2 —6 -6 | 24 —120 —6 24 — 120 720 — 5040 

(7) (1) ‘ 4 2 | -18 120 —12 96 —720 5760 

(6) (2) -1 I 4 12 60 8 —20 84 — 480 3360 

(6) (1)? -1 ‘ 6 —60 2 —36 360 — 3360 

(s) (3) -s + 4 14 64 —12 64 —384 2688 

(5) (2) (1) e , —2 9 60 . 12 -72 504 — 4032 

{s5) (1)* . ‘ » -1 20 = A s —120 1344 

(4)* : ° 2 1 -6 30 | 3 ~6 30 — 180 1260 

(4) (3) (1) ° -4 -1I 12 7° 6 — 56 420 — 3360 

(4) (2)? ° ° —3 3 —15 -6 9 —33 180 | -—1260 

(4) (2) (1)* A . , -3 30 J -3 30 —270 2520 

(4) (1)* ° . —§ ‘ -1 30 — 420 

(3)? (2) I -1 —2 5 —20 6 —28 160 — 1120 

(3)* (1)* 1 -3 20 12 —120 1120 

(3) (2)* (x) 2 I 3 15 -6 32 —210 1680 

(3) (2) G@* 4 3 3 I —10 -8 100 — 1120 

(3) (x)* 10 10 15 10 f ° ° -6 112 

aye »? ; : I -1 3 -15 105 

I I -6 —420 

(2)* {rye 20 12 28 8 3 6 I -ts S10 

(2) (x)* 7° 60 150 80 6 1 1 r —28 

{3}. 280 280 840 560 56 van | & am 28 z 




























































































































Table 1.9 
w=9 (i) {9} | 82) | f72) | (7x*]} (63) |f622) | [62°] | (54) | ts32)| fs2*) |[szx*]] [519 [4*1] | [432] | (431°) | (42"x) | [422°] 
(9) Zz a | — 2]-1 -é6; -2z 2] -6 24 2 —6 -6 
(8) (1) I Z —2 —1 6 —1 4] —-24/|-1 4 2] -18 
7) (2) I Z -1 1 3 —2 3| —12 —1 i 4} -—12 
7) (1)* 1 2 I Z | =% -1I 12 -1I » 6 
6) 3} I ri -1 2 -1 2 -8 -1 2 -8 
6) (2) (1) I I 1 I zi -3 | —2 12 —2 9 
6) (1)* I 3 3 3 I 3 ra =| ‘ -4 e . ° —-1I 
5) (4) I ‘ * . gi -1| -1 2 -6] -2 ~ 4 3] —12 
5) (3) (1) I I . I r/ —2 8 —2 ° 6 
5) (2)* I 2 1 | gi-1 3 . —1 3 
} 
(5) (2) (1)* I 2 2 I 2 2 1 | 2 I Z -—6 . -3 
5) (1)* 1 4 6 6 4 12 4 1 4 3 6 Z . ° ° 
(4)? (1) I 1 ° ‘ ° 2 ° ° ° Zz ° -2 -1I 6 
4) (3) (2) I ; I . 1 1 . . . . Z —1 -2 5 
4) (3) (1)* I 2 1 I i 3 | 2 ° ° 2 I Z % -3 
4) (2)* (a) I I 2 ‘ 2 2 2} a I e I 2 = z -3 
4) (2) (1)* i. 3 4 3 4 6 1 é 6 3 3 . 3 4+ 3 3 Z 
acer I 5 10 10 10 30 10 20 15 30 5 5 10 10 15 Io 
3 1 ° oi a 3 . ot : > : ° . ° 
3)? (2) (1) i | 1 | | 3 I 2 | 2 ° ° 2 
| i | 
(3)? G@® r | as) «2: ‘we 3 3 I 6| 6 ‘ 6 6 6 ‘ 
(3) (2)* ee ae ee 1 ; 3} ; 3 | ; 3 
ae cr) <i} s| 3] s 5 4 sa 6 st & 2 7 I 2 
(3) (2) ()* r | 4 7 | 6 9 16 11 | 20 9 12 I 12 23 18 12 4 
(3) (1)* SiS] Sy as 21 60 o| 21 66 | 45. 90 15 | 30 75 75 90 60 
(2)* (1) i @ tS ie * eee 4 4 | 6 ‘ 6 | : 3 12 6 
2)? (1)? ms, $f 8] 3 10 12 1] 12 18 12/ 9 9 jo 9 18 3 
(2)* (1)® 3 §| i | 10 20 40 10 26 60 | 36) so! 5 35 100 7° 80 30 
2a) kh. oil 21 42 | 112 35 | 56) 182 126 | 231 | 35] 105 350 315 420 245 
1)° I 9 36 36 84 | 252 | 84 126 | 504 | 378 | 756 126 | 315 | 1260 | 1260] 1890 | 1260 
i | 
| 
w=g (ii) | [41°] | [3*) | 3*2x} | (3*1°) | (32) | [32%1*]) C32r*) | (30%) | (etx) | [233] | [2225] [21°] {r*] 
—— a 

(9) —120 at = 24 —6 24 | —120 720 24 | —120 720 | —5040 40320 
(8) (1) 120 | 2 —18 i —720 —6 72 = 5040 — 45360 
(7) (2) | 60 | 2 -6 6] —14 60 | —360 | —24 90 | —480 3240 | —25920 
(7) (xy* — 60 6 2 —36 360 ° —18 —2520 25920 
63 (> 4 -3 | s —14 2 “<i oe -_ = ¢ — 360 — — 20160 

2) (1) - } —1I 3 —4 3 - 420] -—3 30240 
(6) (1)* 20 -1 —120 ° 2 - Bao = ~ 
(s) (4 54 2] -12 3 | -10 54 | —324 | —12 54 | —324 2208 | —18144 
(s) (3) (2) —40 -2 12 8 —56 384 . —36 320 | —2688 24192 
(s - es -3 -_ go 12 —36 174 | —1134 9072 
(5) (2) (1)* 30 : : : 3 —2 18 — 180 . 18 —180 1764 | —18144 
(5) (1)* —5 ‘ . < | —1 30 10 —210 3024 
(4)* (1) —30 6 2 —24 180 3 ~18 150 | —1260 11340 
(4) (3) (2) 20 —2 6 -3 9 38 210 12 —S1 280 — 1890 15120 
(4) G) G@)* 20 : —6 1 24 —210 9 — 140 1470 — 15120 
(4) (2)? (1) 15 ‘ 2 12 —9go -6 27 —165 1260 — 11340 
(4) (2) (1)* —10 . -4 60 =% 5° —630 7500 
(4) (x)* Z : “a ed. #2 —756 
(3)" r I 2 2 -8 40 -6 40 —2 2240 
(3)* (2) (1) 1 | I -3 —-4 20 —120 18 —140 tr20 | —10080 
GY ay ‘ r | 3 I " 4 4° 20 — 280 3360 
(3) (2)* ° - | : r aad. 3 —15 —— Ir —50 315 —2520 
(3) (2)* (1)* 2 | 4 1 r -6 45 -9 80 —735 7560 
(3) (2) (1)* 4 16 | + 3 6 z 15 —10 175 — 2520 
ae 6 10 | 60 20 15 45 is Z . . —9 r 
2)°(1 } . 4 ‘ z —3 15 — 105 5 
(2)? (a)? 6] 18} 10 9 | 3 z —10 105 ~ue 
(2)* (1)* I 20 | 100 20 40 7° | To 15 10 r ae 378 
(2) (1)’ 21 JO | 490 140 | 210 $25 | 140 7 105 105 21 z —36 

126 | 280 | 2520 840 | 1260 ee | 1260 84 945 1260 378 36 i I 









































































































































Table 1.10
w = 10 (i) [10] [or] [82] [81*] [73] (721) | [71°] [64] (631) | [62*) (621*] [6x*} | 
(10) I -1 -1 2 -1I 2 -6 -1 2 2 —-6 24 \ 
(9) (1) I I -2 -1 6 -1 4 —24 
(8) (2) i : a =s —- 3 > 3 —12 
(8) (1)? I 2 I Z -3 . =e 12 
7) (3) I . . I -1 2 -1 e 2 -8 
(7) (2) (x) 1 I I : 1 Z -$ ° ° -2 12 
(7) (1) 1 3 3 3 I 3 Z : : : —4 
(6) (4) I ° . Z -1 at | 2 —6 
(6) aya?) I 1 1 1 I . -2 8 
6) (2)* 1 2 . 1 . Z =e 3 f 
{8} a 1 2 2 1 2 2 ° I 2 I z -6 
6 i 1)* I 4 6 6 4 12 4 I 4 3 6 z 
(5) (4) (1) 1 I . ‘ I ° ° 
(5) (3) (2) 1 ° 1 ° 1 . ° . 
(5) (3) (1)* 1 2 1 1 1 2 2 ° 
(5) (2)? (1) I I 2 ° 2 I ° I 
(5) (2) (1)* I 3 4 3 4 6 I 3 6 3 3 
5) (1)* 1 a 10 10 to 30 10 5 20 15 30 5 
(4)* (2) I ° I ° ° . 2 ° 
(4)* (1) 1 2 1 1 é ‘ . 2 . ‘ ° 
(4) (3)? 1 R = : 2 ; ‘ I . ‘ . 
(4) (3) (2) G1) I I 1 ° 2 1 ° 2 I ° ° 
(4) (3) (1) I 3 3 3 2 3 I 4 3 : : 
) (2 1 ° 3 . ° ° ° 4 ° 3 ° 
(4) (2)* (1)? I 2 3 u { 4 4 4 3 2 
(4) (2) (1)* I 4 7 6 16 4 8 16 9 12 I 
(4) (1)* 1 6 15 15 20 60 20 16 60 45 go 1s 
(3)? (1) 1 1 e ° 3 $ ° 3 3 ‘ ° 
(3)* (2)* 1 ‘ 2 2 I : 1 
(3)? (2) (1)* I 2 2 1 4 2 5 6 1 I } 
(3)* (1)* 1 4 6 6 6 12 4 9 12 3 6 I 
(3) (2)? (1) 1 I 3 . ¢ 3 4+ 1 3 : 
(3) (2)* (1)* I 3 5 3 oI I 10 15 7 6 
(3) (2) (1)* 1 5 iI 10 16 | i 20 45 25 40 5 
(3) (1)? I 7 21 21 36 tos | 35 42 147 105 210 35 
(2)° I ° 5 . | } 10 | 10 
(2)* (1)? I 2 5 I 8 8 | | 10 8 10 4 
(2)* (1)* I 4 9 6 16 24 | 22 4° 24 I 
(2)* (1)* 1 6 17 15 32 72 «| 20 | 46 120 76 120 1s | 
(2) (1)* I om 29 28 176 s6 | 98 36 238 448 7o | 
(1)! I 10 45 45 120 360 120 | 210 40 630 1260 210 | 5 
| 
w = 10 (ii) (s*} {s41] {s32] [531°] {s2*1} (sar) | [51°] [4*2] (4*1*] | (43°) | (432) | 
(a -13 2 —6 -6 24 —120 2 —6 2 -6 
(9) (1) al | _ 4 2 —18 120 4 2 
8) (2 -1 1 4 —12 60 -1 I 2 
(8) (1)* ° -1 ° 6 —60 ed 
-1 2 2 -8 40 —2 3 
(7) (2) (1) . -2 9 —60 1 
ay as ° . ° 3 20 ‘ ° ° ° 
4 ° -1 ° 2 I - 30 -2 4 -1 3 
(6) (3) (1) ° ° ‘ -2 . 6 —40 ; i ‘ -1 
(6) (2)* : i ; a -1 3 —15 ‘ 4 i } 
(6) (2) (1)* ° - ° 
(6) (1)* . b- 
(5)* I —1 - 2 2 -6 24 2 : 
(5) (4) (1) I Z 2 -1 6 —30 —4 —1I 
(5) (3) (2) 1 ° z -1 —2 5 —20 . -1 
(5) (3) (1)* I 2 I I a 20 : 
(5) (2)* (1) 1 I 2 ‘ I a 15 ‘ 
(5) (2) (1)* 1 3 4 3 3 I -10 a 
(5) (1)° 1 5 10 10 15 10 I . 
(4)*(a) 4 _ r -1 -1 
(4)* (1)* 2 4 . ‘ 1 r . . 
(4) (3)* ° ° ‘ ‘ I -1 
(4) (3) (2) (1) I I I : : 1 . I z 
(4) (3) (1) 3 9 3 3 ; 3 3 I 3 
( eye ( ,. > : . 3 5 ° 
4) (2)? (1 2 4 ° 2 ° I 
(4) ayy t 12 16 12 12 4 ; 6 4 12 
ye oO 36 60 60 90 60 6 15 15 10 60 
3° ° ‘ ° ° ‘ ° _ ‘ ° 
3)" (2)* 2 4 ; 
(3)* (2) (1)* 2 4 4 2 ° 2 
3)" (1)* 6 24 12 12 ‘ 12 12 4 a4 
3) (2)* (1) 3 3 9 ‘ 3 3 3 3 
(3) (2)* (1)* 5 15 19 9 9 2 9 13 21 
(3) (2) (1)* It 55 61 50 45 20 1 35 30 35 115 
yan 21 147 231 231 31s 210 21 105 105 105 525 
° ° 1s 
(2)* (1)? 6 12 24 ° 12 . : 1 12 2. 
(2)? (1)* 12 48 72 6 48 12 r * 3 48 126 
(2)* (1)* 26 156 232 180 216 100 6 135 105 160 600 
(2) (1)* 56 448 784 728 1008 616 56 455 420 560 2800 
(1) 126 1260 2520 2520 3780 2520 252 1575 1575 2100 | f 
Re Tiedt toa Wc Mince Tht Bers 





























































































































Table 1.10 (cont.) 
= : : 
| w = 10 (iii) [431°] [42"] [42"1*] [421] [4r"] f3*1) [3*2"] [3*21"] {3*r*) [32°] 
‘ (10) 24 —-6 24 —120 720 -6 —-6 24 —120 24 
‘9 (1) —18 : —12 96 —720 2 —12 96 -6 
(8) (2) —6 6 —14 60 — 360 4 -8 36 —18 
(8) Ga}? 6 2 —36 360 , 2 —36 
28, | a.-3 | ey ee 2) Ss 
7) (2) 3 - 3 : 4 - 
(7) G}* -1I ° 8 —120 4 
6) (4) —12 5 —12 54 —300 3 1 —10 -8 
(6) (3) (x) 4 = 240 —" 10 —56 2 
-3 + —I5 go 1 1 = 6 
(6) (2) (1)* ° ° -2 18 — 180 -1 6 : 
ce 1)* ; <5 . -1 30 . - | : - ‘ 
= ; -—4 —144 : . -4 - 
(s) (4) (x) 12 at 6 - 324 : ef f 4 -8 3 
(5) {3} (2) 3 : 4 —20 120 : | 6 —24 12 
(5s) (3) (x)® -3 ° > 12 —120 . ° -2 24 e 
(5) (2)* (1) ° : —2 12 —go | . = 
3} 2) (1)* -4 60 | ‘ 
5) (1)* a : 
(4)* (2) 3 =3 5 —18 9° 2 = 3 
Gy -3 —1 12 —9go ° ° . 12 e 
3G a 2 ‘ 2 -8 40 -3 -1 6 —22 3 
(3) 3} (2) (x) — : -4 20 —120 : : a | 24 =—3 
08} 03) Gy I . —4 40 = . 
I -1 3 -15 ‘ -1 
(4) (a)? Gy . 1 I —6 45 : : 
(4) (2) (1)* 4 3 6 z ~i7 . 
(4) (x)* 20 15 45 1s Z ° - | ‘ ° 
(3) (x) : . ‘ ss r é -2 8 
(3)* (2)* je . ‘ : ‘ : I -1 3 -3 
| (3)* (2) (a)* 2 I z —6 . 
(3)* (x)* 8 4 3 6 z : 
(3) (a)* (1) 1 : 3 : Z 
(3) (2)* (1)® 1 3 3 6 | > é 3 
(3) (2) (1)* | 30 | 15 30 20 25 4° s 15 
(3) Gay’ | 175 | 105 315 105 7 7° 105 210 35 105 
(2)* | Ss 10 ol » e 
(2)* (1)* | bere 10 6 . 12 ° 8 
(2)* (1)* 2 | 28 36 3 : 24 | 48 36 40 
| (2)* (1)* 140 | 120 240 45 1 120 | 220 300 30 240 
| 
(2) (1)* | 840 | 630 1680 490 28 560 | 1120 1960 280 | 168 
| \ (1)** | 4200 3150 9450 3150 210 2800 6300 12600 2100 12600 
| | | ] 
| w = 10 (iv) [32*:*] | (321°) | [31"] [2°] j2*s*) [2*1*] f2*1'] [21°] [1**} 
«7 ia ti | | 
(10) —120 | 720 —s5040 | 24 —120 | 720 — 5040 40320 — 362880 
(9) (1) 72 | —600 5040 . | —480 4320 — 40320 403200 
(8) (2) 66 — 360 2520 —30 102 — §04 3240 — 25200 226800 
(8) (1)* —18 240 — 2520 ‘ —6 144 — 1800 20160 — 226800 
(7) (a) 64 — 360 2400 . 48 | —336 — — 19200 172800 
(7) (2) G1) —42 300 —2520 . -48 | 300 - Oo 25920 i 
(7) Gy 2 —60 840 . = —24 —6720 6400 
4 48 — 300 2100 ~20 52 — 300 2100 -1 I§1200 
(6) (3) @) —42 320 —2520 ‘ —-16 | 232 —2160 20160 — 201600 
) (6) (2)* —18 90 — 630 20 —44 186 —11I0 8400 —75600 
(6) (2) (1)* 12 — 120 1260 . 8 — 120 1260 —13 1§1200 
(6) (1)* . 10 —210 . > 2 — go 1680 — 25200 
(s5)* 24 — 144 1008 ;. 24 —144 1008 — 8064 72576 
(s) (4) (1) —3o 270 — 2268 2 —24 216 — 1944 18144 — 181440 
(5) (3) (2) —40 204 1344 —48 264 ~—1728 13440 — 120960 
(5) (3) (2)* 12 140 1344 —72 ~ 10752 120960 
| (5) (a)* (1) 12 —75 630 24 —144 1044 —9go72 90720 
(5) (2) (x)* 2 30 —420 24 =g 4704 - 
| (s) (1) —1I 42 . 12 — 336 
(4)* (2) —15§ go — 630 15 —27 126 —810 6300 — 56700 
fay", (a) 3 —60 630 . 3 —36 450 — $040 56700 
4) (3)* ~20 110 — 70° ° —12 96 — 700 5600 — §0400 
(4) ) (2) (1) 27 — 190 1470 . 24 — 204 1680 ~ 15120 151200 
3 633,(1)" —8 40 — 490 . . 12 ~-280 3920 — §0400 
3 —15 105 10 14 —SI 285 —2100 18900 
(4) (3)* (i)* ~3 30 —315 . 6 54 —495 5 — 56700 
(4) (2) (1)* : -$ 105 —_ 75 —12 18900 
(4) (1)* . ‘ -7 . : ‘ —1 56 — 1260 
(3)? (1) 6 —40 280 : . —24 240 —2240 2 
(3)* (2)* y —35 210 i 12 —60 370 — 2800 25200 
(3)* (2) (1)* —6 50 —420 : . 36 —420 4480 — $0400 
(3)* Ga)* ° —§ 7° ° ‘ ° 3° — 560 8400 
(3) (2)? (1) -3 15 —105 . -8 “4 — 300 2520 — 25200 
(3) (2)* (1)* Z —10 105 . ‘ —12 160 — 1960 25200 
(3) (2) (x)* 10 z —al . — ° —12 280 — §040 
(3) (1)? 105 ar I ‘ . = -8 240 
’ ° Z —1 —1I5 105 —945 
(2)* (1)* I I -6 4s —420 4725 
(2)* (1)* 12 ° 3 6 z —15 210 —3150 
(2)* (1)* 140 12 15 4s 15 Z —28 30 
| {2} an" 1400 224 8 105 420 210 28 z —45 
| | 12600 2520 L 120 | 945 4725 3150 630 4s z 
2 ee a a a 




























Table 1.11

w = 11 (i)
















Table 1.11 (cont.) 










































































i i / 
{632} w=11 (ii) {632°} | (62*x] | [621°] | [61°] | (s*x) | (s42] | {s42"] | [53°] | (s324] | [s32*] | {s2*] | (s2*s*)| [s2x*) | (sx*] | {4%3) 
. (11) —6 —6 24 —120 2 2 —6 2 -6 24 —6 24 —120 720 2 
=g (10) (1) 4 2/| -138 120 -1 i 4 : 2 —18 —12 96 —720 . 
£9 I 4 12 60 . —1 I —6 6 —14 60 9 ° 
a {9) 1) -1I ° 6 —60 a ‘ -1I 6 2 —36 360 . 
(8) (3) 2 -8 40 . . . —2 3 -8 -8 40 | —240 —3 
(8) (2) (1) —2 9 —60 -1 3 8 —48 360 ° 
(8) (x)* . A -1 20 ‘ . . ‘ ° —1 8 —120 
(7) (4) 2 I —6 30 > -1 2 > I —6 3 -—6 30 — 180 -2 
(7 gy) —2 : 6 —40 =e 6 4 —38 240 : 
(7) (2° . ~% 3 a : —3 4 —15 90 . 
(7) (2) (x)* . —3 30 ‘ -2 18 — 180 ° 
<i (7) (x)* . : =$ : : -1 30 ‘ 
(6) (5) 2 2 -6 24 -2 -1 4 -1 3 —12 2 —10 48 — 264 ° 
I (6) (4) (x) 2 —1I 6 —30 j —2 6 ‘ 2 —24 180 ° 
: (6) (3) (2) —t —2 s —20 -1 3 4 —20 120 . 
2 (6) (3) (1)* I : —3 20 ~3 , 12 | —120 . 
4 (6) (2)* (1) I -3 15 -2 12 —90 ° 
10 (6) (2) (:)* 3 3 Z —10 ° —4 60 ° 
F} (6) ¢1)* 10 15 10 Z . -6 ' 
p] (s)®(@) : I —2 —1 6 4 —24 144 . 
‘ (5) (4) (2) : ; - ‘ ‘ r —1 ‘ -1 3 -3 5 —18 90 6 
i (5) (4) (x1)® : = 4 : 2 I Z m -3 . -1I 12 —9go > 
I 5) (3)? : : ‘ : ‘ ce .% z -1 2 ‘ 2 -8 40 . 
3 (5) (3) (2) (2) . 1 1 | - | I z <. . —-4 20 | —120 : 
, (5) (3) (1)* 3 : . 3 3 | 3 | I 3 i . —4 40 : 
4 (5) (2)° . . . : 3 | » . ° r = 3 —15 . 
16 (5) (2)* (1)* , 2 . 2 3 | I 2 4 I z -6 45 ; 
60 (5) (2) (1)* 12 12 4 | 34 6 4 16 4 3 6 z “33 . 
(5) (1)* 60 90 60 6 6 | 15 | 15 10 60 20 15 45 15 I . 
(4)* (3) . : : ‘ % : é ‘ r 
(4)? (2) (x) | 2 : : 1 
(4)? Gy | 6 6 6 ‘ . 2 1 
2 8 3 OR ‘ I ‘ 2 
2) | (4) 6) (2)* 2 . . . I 
6 (4) (3) (2) G@)* I 2 4 1 | 2 2 ; 3 
6 ) (3) Gx)" 6 12 18 18 | 4 12 4 5 
m4 | | @@PQ) 3 : ; ; 1 3 
50 (4) (2)? (1)* 6 9 2 i 6 | 14 6 6 12 ° 3 3 7 
210 (4) (2) (1)* 40 | 45 20 r | 20 | 46 40 20 | 80 20 15 30 5 . 15 
() Gy 210 315 210 21 42 | 126 126 | 70 | 420 140 105 315 105 7 35 
3 ; 
(3)? (2) | 3 ‘ ; ‘ . 
3 (3)° (x)? 3 | | 3 . : = 6 
12 (3)* (2)? (1) I 2 4: 5 4 ° : ; 2 
30 (3)* (2) (1) 3 I 6 12 | 6 | 9 12 2 . : 12 
? (3)* (1)* 30 15 10 I 30 | 60 60 | 21 60 20 ‘ 8 30 
16 (3) (2)* ‘ . a : oa 7. . . 4 : 3 
48 (3) (2)? (1)? I 6 | ori 3} | | en 9 
156 (3) (2)* (1)* 30 28 | 8 . 20 | 52 30 | 32 76 | 12 | 12 8 | 2 35 
588 (3) (2) (x)* 135 150 | 80 6 66 | 186 165 | 96 366 | 100 60 135 30 1 10S 
(3) (1)* 588 840 560 56 168 588 588 336 1848 | 616 420 1260 420 28 315 
20 | | | 
52 (2)* (1) . 10 . — " 30 . | 10 ‘ 1s 
160 | (2)* (7)* 12 30 | 4 13 | 66 18 | 36 72 | . = 18 39 
5 (2)? (1)* 1u0 120 40 | 1 | 60 | 198 120 | 120 360 | 60 7 120 15 135 
1596 (2)* (1)" 420 $32 | 280 | zt | 182 | 658 S46 | 3092 | 1624 420 330 756 135 455 
4620 | (a) | 1512 2142 | 1344 | 126 504 | 2142 2016 | 1344 7056 2184 1638 4536 1386 84 1875 
| (1) 4620 | 6930 | 4620 | 462 1386 | 6930 | 6930 | 4620 | 27720 | 9240 | 6930 | 20790 6930 462 | 5775 
ee U | u 



















Table 1.11 (cont.) 


re—__. 





































































































' 
| | | j 7 an a 
w =11 (iii) | tata (4*0*) | (43*r) | (432") | (4322")| [4324] (42*1] | [42*1*] | [421°] [417] (3*2] | {3*1*] | (3*2*1}| [3%21°) (3*1'} 5 w=) 
———$_$__j_ Float ol eet at {——_—_—_——} } 
(11) -6 24 —6 -—6 24 —120 24 —120 720 — 5040 -6 24 24 —120 720 
(10) (1) 2] -18 2 - | -12 96 —6 72 | —600 5040 - | -12 —6 72 | —600 a 
) (2) 2 —6 ‘ 4 -8 36 —18 66 — 360 2520 2 —2 —12 42 —240 | (2) 
(9) (1)* ‘ 6 2 —36 —18 240 —2520 2 . —18 240 | 25) da) 
) 1 —2 4 2 —10 40 ~6 | 40 —240 1680 6 —18 —16 Jo — 360 ie (G3) 
i) 2) (1) -1 3 ; ; 4 “= 6 —42 300 — 2520 . ‘ 4 —24 180 (8) (2) 
)(a)* ° -1 a ; ‘ ‘ 2 — 60 ° ‘ ° 2 — 60 (8) (1) 
(7) (4) 4] -12 4 3] -12 54 | —12 54 | —300 1980 - | 32 —6 48 | -300 (7) (4) 
(7) (3) (1) ° ‘ —2 . 6 —32 " —24 200 — 1680 ° 12 4 —48 320 (3) G) 
(7) (2)* ° ° ; -1I I —3 6 —18 90 — 630 2 -6 30 |) (3) (2) 
(7) (2) (1)* . ; i ‘ -1 6 P 12 ~120 1260 " 6 — 60 (7) (2) 
7) Gi)* ‘ e -1 ° 10 —210 " 10 (3) (1) 
6) (5) 2 —12 I 2 -—8 48 = 42 ~264 1848 3 —6 —10 42 — 264 (6) (5) 
(6) (4) (1) —2 12 -1I 6 —48 5 —36 270 —2100 6 1 —30 270 (6) (4) 
(6) (3) (2) . -2 3 —12 —22 120 —840 —3 3 10 —29 140 (6) (3) 
(6) (3) (x)? : —3 12 6 —80 840 =s 1S | —440 (6) (3) 
(6) (2)* (1) . : ~< 12 —75 630 . =a 3 -5 (6) (2) 
6) (2) (a1® . . ° me —2 30 —420 ° ‘ ‘ —1 10 (6) (2) 
(6) (1)* ‘ ‘ ‘ ‘ ‘ ‘ pe ° -1 42 ° : ° ° -1I (6) (1) 
(s5)* (1) ° 6 ‘ . 2 —24 . —12 120 — 1008 ° > 2 £2 120 (5)? (1 
(5) (4) (2) —2 6 : a% 5 —24 9 —33 174 — 1134 4 —18 120 (s) (4) 
(5) (4) (1)? ‘ -6 ‘ ° —1 24 ‘ 9 —120 1134 e ° 6 —120 (5) (4) 
(5) (3)* ° 5 1 — 2 -8 é —6 40 — 280 —3 6 6 —a3 104 (5) (3) 
(5) (3) (2) (1) ° ° ° s | -2 12 . 12+] —100 840 ° “ —4 18 — 120 (3) 3 
(5) (3) G1® > —4 ° 20 —280 ° <9 40 (303 
? ° —fZ 3 —ag 105 ° . (5) (2) 
(5) (2)* (1)* . =3 30 —315 : (5) (2) 
(s) (2) (1)* . =9 = : (5) (2) 
(4)? (3) —8 2 -—2 -1 4 —14 3 —14 7° — 420 6 2 —18 100 0 
(4)* (2) (1) I -3 ‘ —2 12 -3 15 —9go 630 6 —60 (4) (2 
4)' (1)? 3 Z ‘ —4 al | 20 —210 ° 20 (4) G 
(4) GY (1) : I ‘ —2 8 6 —40 280 -6 I 18 —110 (4) (3) 
(4) (3) (2)? . . z —§ 3 =3 8 —35 210 2 6 —30 (4) G3) 
(4) (3) (2) (1)" 2 2 I Z —6 —6 50 — 420 . -6 60 (4) (3) 
(4) (3) ¢ 12 4 4 3 6 Z =—— 7° —10 (4) G) 
(4) (2)® (1) 3 : 3 z =< 15 — 105 (4) (2) 
(4) (2)* (1)* 9 1 6 7 6 3 z ae 105 (4) (2) 
(4) (2) (1)* 35 10 20 25 40 5 15 10 z —2i (4) (2) 
4) Qa)’ 105 35 7° 105 210 35 105 105 21 I (4) Ga) 
(3)? (2) . ; . I -1 —2 5 —20 2 (2 
Oye ods 6 | A * 1 I | . -~3 20 fy fa 
(3)* (2)* (1) I 2 1 . : 2 - | I 3 15 (3)? (2 
3)* (2) (1)* 6 15 6 6 ‘ ; 4 3 3 I —10 (3)? (2 
3)* (1)* 60 20 45 30 60 Io ° : 10 10 15 1° rie | (3)? (1 
(3) (2)* . 6 > ‘ 7 ‘7 | ; ‘ <a (3) (2) 
(3) (2)® (x)? 6 6 12 3 2 , 6 6 P ot (3) (2) 
(3) (2)* (1)* 36 a 52 44 42 I | 12 - | 20 12; 28 8 - | 3343} 
(3) (2) (1)* z10 60 210 210 | 345 | 45 | go 60 6 7° 60 | 150 80 6 | (3) (2) 
(3) ()* 840 | 280 840 | 1050 | 2100 350 | 840 840 | 168 8 280 280 | 840 560 56 | (3) (a) 
(2)* (1) 15 . ° 30 -— 10 ° | et ° ° ‘ > so 7 (2)* (1 
(2)* (3)* 45 3 36 78 36 | 30 6 | - | ‘ 24 36 é -| (2)*(y) 
(2)* (1)* 195 30 240 300 300 | 15 140 60 | 3 | ‘ 120 60 240 60 | -1i (2)* (x 
(2)* Gy" 945 245 1120 1400 2100 245 840 560 63 | 1 560 420 | 1540 700 42 (2)? (1) 
(2) (1)* 4095 1260 5040 | 6930 | 12600 | 1890 5670 5040 882 | 36 | 2800 | 2520 | 10080 | 5880 504 | | (2) (a)! 
«1)" 17325 | S775 | 23100 | 34650 | 69300 | IISS© | 34650 | 34650 6930 330 | 15400 | 15400 | ©9300 | 46200 | 4620 | § | (ye 
a ts SS | a oS PEs ae: ares — ee SA eal an 


ag 











a 




Table 1.11 (cont.) 



























































w=11 (iv) (32°) | (32*3*) | (s2"14} [321°] (31°) {2*1] {2*1*} | {21°} [2*1"] [21°] 
| 
| | | 
(11) 24 —120 720 | —s5040 40320 | -120 | 720 — §040 40320 — 362880 
(10) (1) . 48 — 480 4320 —40320 | 24 | —360 3600 —35280 362880 
) (2) —24 78 —384 | = — 20160 120 —552 | 3360 — 25200 221760 
<9) G1) ° -6 144 | —1800 20160 ° 72 —1200 | 15120 — 181440 
) (3) —6 60 —360 | 2400 — 18480 30 — 300 2280 | —18480 166320 
(8) (2) (1) ° —36 264 | —2160 20160 | —30 306 —2520 | 22680 — 226800 
tes ° “ —24 480 — 6720 . —6 240 | — 4200 60480 
(7) (4) —12 42 —276 1980 —15840 | 60 — 288 1980 —1584e 142560 
(7) (3) @) ° —24 256 —2160 19200 . 144 — 1680 16800 — 172800 
(7) (2¥ 12 —24 102 — 630 5040 —60 216 | -1170 | 8280 — 71280 
(7) (2) G)* ° 6 —84 900 -10080 | . | —72 900 | — 10080 116640 
(7) Ga) - ; 2 —go 1680 | : a —30 | 840 —1§120 
) (s) — 46 264 | 1848 — 14784 | 40 —264 | 1848 — 14784 133056 
(6) (4) (2) ° —16 192 | —1800 16800 | —20 | 156 | —1500 14700 — 151200 
(6) (3) i} 8 —44 212 | —1320 10080 | —40 256 | —1660 | 12600 ~ 110880 
(6) (3) (1)* > 2 —84 | 960 —10080 | ‘ —24 | 580 | -7560 90720 
(6) (2)* (2) 12 —72 540 —§040 | 20 —132 | 930 —777° 75600 
(6) (2) Gy’ . 16 | —240 3360 ° 3 —200 | 2940 — 40320 
(6) (1)* " - | 12 — 336 . 2 | —126 3024 
(5)? (1) —12 96 | —864 8064 72 —720 | 7056 — 72576 
| 
(5) (4) (2) 12 —33 168 | — 1134 9072 | — 60 252 —1i1512 | 11340 — 90792 
(5) (4) (x)? : 3 ~60 810 — 9072 . et 540 — 6804 81648 
(3)* } . =< 104 | —664 4928 72 — 600 4928 — 44352 
$5} 633 (2) (1) | ° 24 —160 | 1224 — 10752 — 144 1320 — 12096 120960 
(s)(3) ‘ > 16 | — 280 3584 . " —120 2240 — 32256 
(5) (2)° —-4 5 —18 105 —840 20 — 60 204 | — 1974 16632 
(5) (2)* Gy? . =g 24 —225 2520 : 36 —360 | 3654 — 40824 
(5) (2) (1)* ° . —2 45 —840 ° ° 30 —630 10584 
(5) (x)* : . ; —% 56 . . 14 | ~ §04 
(4)? (3) 3 —12 86 — 600 4620 —15 78 -| — 570 4620 —41580 
| | 
(4)* (2) (1) . 6 —60 | 540 — 5040 15 81 630 —5670 | 56700 
(4) Gy . . 4 | —120 1680 ° 3 — 60 1050 — 15120 
(4) (3)? () : 6 —80 | 660 - 5600 : —36 480 — 4909 S0400 
(4) (3) (2)? -6 15 —6s | 390 — 2940 30 —120 | 075 — 4830 41580 
(4) (3) (2) G:)* : —% 54 | —$70 5880 . 36 | —s10 S880 - ) 
(4) (3) Gx)* ° -% 60 — 980 ee . 15 — 490 8820 
(4) (2)? (1) -2 12 —9go 840 oe 7 42 | 255 1995 — 18900 
(4) (2)? G1)® . —4 60 — 840 . —6 90 — 1155 15120 
(4) (2) (a) : - —~6 168 ; -3 105 — 2268 
(4) Gy’ . -8 ° —% 72 
(3) (2) ° 6 —28 160 —1120 ° | —24 180 —1400 | 12320 
GP Gay . . 12 —120 1120 An . — 60 840 — 10080 
} (3)? (2)? (1) . -~6 32 | —210 1680 ° 36 — 300 2590 — 25200 
} (3)? (2) GQ . . -8 | 100 —1120 os | 60 —g8o | 13440 
} (3)? G) . ‘ ~_- -6 112 x 4 | ‘ 42 | — 1008 
(3) (2)* r -1 3 —15 105 -5 14 —65 420 | —3465 
(3) (2 (1)* I Z -6 | 45 — 420 ° —12 110 —1050 | 11340 
(3) (2)* (1)* 3 6 & J —15 210 | ° . —15 | 280 | — 4410 
(3) (2) (1)* 15 45 is | I -28 | | 7 —14 420 
(3) (1)° 105 420 210 | 28 z | | = ‘ ~9 
(2)* (x) 5 : - | . e 7 -3 | 15 | — 105 945 
(2) (ay? 13 12 ea ; 3 | zr | -10 | 105 — 1260 
(2)* (1)* 55 100 1s | ; 15 10 x | —ar | 378 
} (2G) 315 840 245 | 14 105 105 | 21 A — 36 
| (2)()* 2205 7560 3150 | 336 9 O45 | 1260 378 36 | I 
17325 69300 | 34650 4620 165 10395 | 17325 6930 99° | 5S 








































































































































Table 1.12 
: ne ] ) 
w = 12 (i) (12) | (11, 1} | (20, 2) | [10, 1*}} (93) | [022] | for} | [84] (831] | [82%] | (82x*}} (8x) | (75) | C742) w=1 
12) Z ~ - ~ oa = 
aby 1 F : ; : ae ‘ - . = ., == a 12) 
10) (2) I ; I ai —7 3 aa ; a I 1) (y 
10) (1)* I 2 I I ~% an a j 10) (2 
9) (3) I : I ate 2 = : > -8 10) (1 
9) (2) (1) I I I I I -3 ae a 9) (3) 
y 1) I 3 3 3 I 3 7 =i 5 9) (2), 
) (4) I r bade =e “ = > 9) (1) 
(8) (3) (2) 1 i : : ; a : : 8) (4) 
2 
8) (2) (1)* ; : ‘ Be . = @) 
2 AS 1 2 2 I 2 2 1 2 I Z -6 * ‘ | 
Bis) Poe hs Me Se eS ees ee Oe ee SD oe Soe ee. | Bi. 
7) (4) (1) I I ; ; : 4 . —" 7) (s) 
7) (3) (2) I I : I : : z 7) (4) 
7) (3) Ga)? I 2 1 I I Pa a : s : 7) (3) 
Bay :7 Sta s 4 om 2G) 
. I 3 + 3 4 6 I 3 6 3 3 | ) i 
s d I 
ta 1) : 5 10 *» . 30 10 5 20 15 30 5 | I | Oc 
(6) (5) (1) r 1 ‘ ; 
8} (4) (2) I ; : ; ‘ I : ; | x | 6) (s) 
6) (4) (1)? 1 2 I I ‘ : é i ‘ ° * 6 { 
3)* I " ‘ 2 . . 2 2 6) (4, 
ee ae lal cp dle ide] api : 7 Hh 
3G 3 ‘ 3 
ee ee a de ee be Bs Oe > ad ae ae 30) 
2 I 1 2 3 1 if s 4 ° . 2 
6) a) ci 24: 6) £2 Bi 2s. 22 ee a) 2 ;| 3| sti 
I 6 15 15 20 60 20 15 60 45 90 15 6 30 6) 1)" 
(5)? (2) I I ‘ 
(5)? (x)? I 2 1 I : : : : 2 (5)* (2) 
5) (4) (3) I ~ 4 - a I : : : 2 (s)* (1) 
5) (4) (2) (1) I I I : I : ; . I (5) (4) 
) 1s} Gi)? I 3 3 3 : 3 : 3 : 2 I 45) (4) 
5) (3)* (1) I i at « 2 : 4 3 5) (4) 
aay | f soe : ; : I 3 : (SG) 
. 2)(1 1 2 2 1 3 2 , 
(s)(3)(r)® |g 4 6 | 6 s 3 ; t : z 3 2 (5) (3) 
(5) (2)* (1) ¢ s 3 6 1 7 12 (5) (3) 
(5) (z)* (x)* : | . . ° 3 5 4 3 (5) (2)° 
: I 3 5 
oe tt | fh] ce] St 8] tl] et et 8] 3g] 8] gl] sl a {3 a 
Sf, | . 4 = | 2r 35 105 35 35 140 10s 210 35 | 22 105 (5) (1) 
4)* (3) (1) I I ’ I : : 3 : : (4)° 
(4)* (2)* . s | . 3 : 2 2 (4)? (3) 
meer fit al ga] a] 3 Pik tt 5] gl 1) ee 
4 j 
hetsall a ee ee ee et eee 
4) GG : . | ‘ 4 
(4) (3) (28) | 4 4 if 4 Se 4 | i | @) G) 
4) 43) (27° I | | 
wage) tal gl a] gl al al al alalal -l al al blag 
4? aye 1 | 3 8 . a0 a8 30 | 10 II 25 15 30 5 16 30 | @)(), 
4) (2)? (1)? I 2 4 3 he | Se é ; eh (4) (2)° 
ayarcay | x 4 8 6 an ak és - 8 6 a : : 8 (4) (2)° 
a) gaat) | ns 6 16 15 26 66 | 20 31 go 6¢ 105 35 $a 36 3 st 
Gy | > | % | 2 | 36 | 168 | 56 | 7 | af | aro | 420 70 | 64 | 288 Gc) 
° e ° ‘ Pr y zs * 3)" 
ony I I I 4 : 3 3 j : 0)" (a) 
(3)? (2)® i 4 : . : . . 9 9 : . 9 18 (3)? (1) 
(398 (2)? (1)* 1 2 3 ; é : ; 3 ; 3 6 , (3)? (2) 
a3}, 3 (2) - ¢ 7 6 10 16 4 15 24 ° 13 r | 8 6 } Gy G3) 
Berm | rt rte] S) a] 8] t) a] a] ae] ee] se) ae) Dl oem 
3) (2) (1 ; 2 3 6 1 : 
Oaar | t | s/ ze] we] a! wl] wo] e] lB] 8 | 38) | | Beh 
ae : | 3 36 i | 2 63 | 189 | 126 | 231 35 | an (3 £2) | 
a | 9 3 36 85 252 &4 135 513 378 | 756 126 162 702 (3-«4) 
2 1 ” | 6 
2)* (1)? } 15 : 1s | , (2)* 
Bans i} eg] al] §] es] ge] a] #1 8] 8] g] oc] 2) 2! 2] Re 
i St I 6 18 15 } 38 78 20 63 150 93 135 I <° 3 | (2) (1) 
3} sie : | 8 30 28 72 184 56 127 392 267 476 . I ¢ (2)* (1) 
(aye : “b rr 3 | 130 37° 120 255 930 675 1305 210 Pin sles i (1) 
a | | | = 660 220 495 1980 1485 | 2970 495 792 | 3960 (1)! 
j UL 






















































F. N. DAVID AND M. G. KENDALL


Table 1.12 (cont.) 













































































aU me | — 
w= 12 (ii) [732] 7) | [72"1) | [721°] (71°) | (6%) {651} {642} | [642*] | [63°] | [6321] | [632%] | [62*] | (62*1*] | 
' ] 
(12) 2 —6 -6 24 — 120 -1 2 2 --6 2 —6 | —6 24 
(11) (1) ‘ 4 2 —18 120 4 -1 . 4 . 2 -1 x —12 
10) (2) 2 I 4 —12 60 -2: | I ° 2 -6 6 —14 
10) (1)* a -1 _ 6 — 60 - -1 ‘ . 6 . 2 
9) (3) -% 2 2 —8 40 . -2 3 ~8 . -8 
9) (2) (1) » e -2 9 —60 ‘i . -! 3 é 8 
(9) (1)* ° ° —1I 20 . * -1I ° > 
(8) (4) 2 1 —6 30 -1 2 1 -6 3 -6 
(8) (3) (x) <2 : 6 —40 : * 6 : 4 
(8) (2)* : +s 3 —15 : =2 4 
| (8) (2) (x)* | ‘ ° -3 30 2 - . . . ‘ -2 
| (1)* ‘ ° . -5§ s . . < ‘ . . 
7) (s) -1 | 2 2 -6 24 2 -1 2 x -6 : —4 
7) (4) (1) ie —2 -1! 6 —30 . . -2 > 6 . 2 
7) (3) (2) | x | -1 “2 5 5 | —20 ° ° . -# 3 ° 4 
9) (3) (1)* | r | I o a. a a a M : 4 -3 = . 
7) (2° (1) | = | _ -.4 -3 1s | 2 ae : . j < . -2 
paar | «| 3 3] 2 | 10 ‘ o : 
7) Gy | 10 10 15 } 10 I 3 ‘ ;: ae k = r 
(6)? i a " ; ; ‘ I -1 -1 2 ~x | 2 -6 2 -6 
6) (5) (1) - 7 ° , | . 1 Z —2 . -1 6 . 4 
6) (4) (2) . | ° | ° - I e —1 » —b } 3 —3 5 
6) (4) (1)? | = ig ‘ ‘ < 1 2 I z ‘ = io. : -1I 
6) (3)? | > aun ‘ 7 1 : £ ‘ Z -1 2 < 2 
6) (3) (2)(1)_ | 1 | ° ° } 3 | ° I 1 1 . 1 Zz -3 ° —4 
6) (3) (1)* } 3 3 o% ° } ° I 3 3 3 x a r : ‘ 
6) (2)? | ; “ , ; 3 I % 3 : ; ; 2 z -1 
6) (2)? (1)" 4 | S$ 7 ; : I 2 3 I 2 4 ‘ I z 
6) (2) (x)* 16 | 12 12 4 . 1 4 7 6 4 16 4 3 6 
6) (1)* | 60 | 60 go 60 6 I 6 15 Is 10 60 20 1s 45 
| (5)* (2) | al . . . ‘ ° ° . . 
s)* (1)? | e 4 " ° 2 4 . . . . . ° 
5) (4) (3) . ‘ . : ‘ N : 3 : ‘ . ; 
5) (4) (2)(2) | ° ° ° 1 I 1 . . ° ° ° . 
5) (4) ()* - | : : 3 9 3 3 : ° . : ° 
5) (3)* (x) . ee ‘ 1 I . ‘ 1 . . ° 3 
(3) (2)* 2 | . ° 4 ; u ‘ a . ° ° : 
5) (3) (2) (a)® 2 | 1 ° 2 4 2 . 2 2 ° . ° 
5) (3) (1) 6 | 6 . 4 16 12 12 4 12 a ‘ ‘ 
5) (2)* (1) } 6 | : 3 I I 3 ° ° ° ° 1 ° 
(s)(2¥ Gay | 14 | 6 9 2 . 3 9 9 3 6 12 . 3 3 
(5) (2) (1)* | 5° | 40 45 20 I s 25 35 30 20 80 20 15 30 
Al 210 | 210 315 210 2" 7 49 105 105 70° 420 140 105 315 
(4)? (3) (2) 6 mi : , ; . : : 
{4)* (2)* — . 2 ‘ 4 ° ° 
(4)? (2) G7? - | . 2 4 4 2 a 
(4) Ga)* ° ; 6 24 12 12 . 
(4) (3)? (2) 2 . I : I 2 1 
(4) (3)? (1)? 2 2 1 | 2 I 1 I 
(4) (3) (2)* (2) | 4 | , 1 ‘ aa 2 2 4 2 2 , . 
(4) (3) (2) G2)? | 8 6 | 3 I - | 4 12 10 6 4 6 I . 
Waar | 2] 2| 3: ef Vw 8 10 50 40 40 10 30 10 ; 
| (4) (2) — - a : +, Ge 4 ‘ 16 | 2 . , ; 4 x 
(4) (2)° fy 12 . 6 . . 4 8 16 | 4 6 12 ° 4 3 
(4) (2)* (x)* 40 24 | 28 8 : 8 32 40 24 20 56 8 12 1 
(4) (2) (1)* 140 120 | 150 80 6 16 96 | 136 120 rs) 300 80 60 135 
‘4 (1)* 560 | 560 840 560 56 28 224 448 | 448 abo 1680 560 420 1260 
3) . . ° ‘ ; 3 a ° ‘ 6 ‘ . ° ° 
3)" (2) (@) 3 3 3 3 | 6 a 
| 
(3 Ga)? 9 9 ‘ . 3 9 9 | 9 6 9 3 " . 
(3)? (2)? 6 7 ‘ . ‘ 1 ; 3 ° 1 ° x 1 . 
te (2)* (1)* 10 2 2 ‘ e 5 10 11 1 1 12 é I I 
(3)® (2) (1)* 30 24 12 4 “ 9 36 39 30 21 48 12 3 6 
(3)* (1)* go 90 90 60 6 21 126 135 135 SI 180 15 45 
(3) (2)* (1) 16 . 6 ‘ i 4 4 16 ° 4 | 4 ‘ 4 . 
(3) (2)® x)" 36 12 18 3 - 10 30 42 | 12 28 48 1 10 9 
(3) (2)* (1)* 112 80 80 | 30 1 20 100 140 | 100 80 240 5° 40 7° 
(.) £2) (a) 372 336 420 |. 245 21 42 294 462 | 420 252 | 1092 315 210 S25 
(3-«4)* 1296 1296 1890 | 1260 126 84 750 1512 1512 924 | 5292 1764 i260 3780 
| (2)* " ‘. «7 ‘ ‘ 10 ‘ 60 | ; . 4 ‘ 20 ‘ 
| tye fie | 40 " 20 . “ 10 20 60 | 10 20 | 40 ° 20 10 
| (2)*(1)* | 4152 48 2 16 ‘ 22 88 148 60 88 | 208 16 52 60 
| (2)? (1)* | 336 | 240 288 120 6 46 276 408 330 280 | 960 200 196 360 
| (2)*(1)* | 1024 | 896 1184 672 56 98 784 1484 1288 806 4032 1120 868 2128 
we | 3000 | 2880 4140 2640 252 210 2100 4620 4410 2940 | 15960 age 3780 | 10710 
(a | 7920 7920 | 11880 7920 792 462 5544 — | 13860 9240 | 55440 183 jo 13860 | 41580 
a 1 i ae. 2 eee i 



















Table 1.12 (cont.) 










































































w= 12 (iii) (621‘] [61°) {s*2] | (s*x*) | {543] | [sa2x] | (542°) | [s3*x] | (532*] | (s321°}| (s3x*) | (s2*2} | (s2*x*] | (szx'] 
(12) —120 720 2 —6 2 —-6 -6 —6 24 —120 —120 720 
gi3} (2) 96 —720 : 4 . 2 a | 2 Y —12 96 me 72 — 600 
10) (2 60 — 360 -1 I E 2 -6 : 4 -8 36 —18 66 — 360 
(10) (1)* —36 360 ‘ -1 é é 6 : ’ 2 —36 ; —18 240} 
(9) (3) 40 — 240 -1 I -2 4 2 —10 40 -6 40 | —240 
(9) (2) (1) —48 360 ‘ -1 3 . . 4 —24 6 —42 300 
$9) 61 8 —120 . ‘ -1 . . . 8 : 2 —60 
(8) (4) 30 — 180 -1 2 -6 2 I -6 30 -6 30 — 180 
(8) (3) (1) —32 240 > : ‘ -—2 ; 6 —32 ‘ —24 200 
(8) (2)* “Se 90 : -1 I -3 6 —18 go || 
(8) a) (x)? 18 — 180 : ° . ° ° ° ° -1 6 ° 12 — 120 
tb es} —1 30 ‘ ° > . . ° ° “3 ° : 10 
7(s 24 -144 —2 4 -I 3 33 2 4 —10 4 12 4 — 26 
(7) (4) (1) —24 180 ° ° . -1 6 . M 2 —24 3 —18 cm 
(7) (3) (2) —20 120 a : ; ‘ -2 3 —12 6 ~22 120 
(7) @) Gy 12 —120 . . ° rl . —1 12 . 6 —80 
(7) as} 12 —9go ° ° . ° ° . -3 12 -—75 
(7) (2) G? -4 60 ‘ : . 6 : . : —2 30 
(7) Gx)* . =@ ; . " . > ° : . =1 
(6)* 24 —120 2 1 —6 I —4 24 —2 18 — 120 
(6) (5) (1) —24 144 —4 I 12 1 6 — 48 2 30 240 
(6) (4) (2) —18 go | -1 3 : 2 2 3 15 90 
(6) (4) (1)* 12 —9go ‘ -3 . 12 ; 3 —60 
6) (3)* } -—8 40 ‘ —1 2 -8 ; —6 40 
HOES oe A | 20 —120 . -2 12 . 12 — 100 | 
(6) (3) (1)* -4 40 . —4 . ‘ 20 
(6) (2)? j 3 —15 | | ° | 3 —15 
(6) (2)* (1)? -6 45 } ' —3 30 
(€) (2) (1)* r —15 | | ; -5 
(6) (1)* 15 I | : 
| | 

(5)* (2) r | eu -1 3 . ~2 3 —12 6 —18 84 || 
(5)* (1)* ° 1 I ‘ ‘ -3 ; 5 -1 12 s 6 - $0 
(5) (4) ¢ e . a r | —! 2 -2 -1 4 —14 3 —14 70 
(5) (4) (2) (1) ° 1 : 1 4 —3 ‘ ‘ -2 12 -3 15 —go 
(5) (4) ¢ 3 3 I 3 I . ° ° —4 ‘ -1 20 
(s) (3)* G) 2 . ‘ r i -2 8 ; 6 — 40 
(5) (3) (2)* 2 : I i . : r -1 3 -3 8 —35 
(5) (3) (2) (1)* 2 I 3 2 : 2 I I -6 ; -6 50 
5) (3) (1)* 6 6 5 12 4 4 3 6 r 5 . -5 | 
(5) (2)? (1) 3 . 3 3 ; _ 3 ’ r -3 15 | 
(5) (2)* (1)* . . 5 3 7 9 I 6 Zz 6 : 3 r —10 
(5) (2) (1)* 5 : I 10 15 35 10 20 25 | 40 5 15 10 r| 
t5}e 1)’ 105 7 21 21 35 105 35 7° 105 | 210 35 - 105 21 | 
4), (3) G) . ‘ 2 ‘ ‘ " | 
(4)? (2)* ° ° . ° . ° } 

4)* (2) (a)? 2 : 4 4 ‘ ‘ 

4° a) 12 12 8 24 8 . 

4) (3)? (2) . : 2 ‘ ; : 
(4) (3)? G)* 6 2 
(4) (3) (2)* (1) . ° 2 ° 4+ 2 > » I ‘ . . . ‘ 
(4) (3) (2) G1)? ° ‘ 6 3 14 12 I 6 3 3 J : 4 : 
4) (3) (x) ° ‘ 30 30 36 90 30 20 rs | 30 5 ; d < 
(4) (2)* ‘ ‘ . 7 ‘ 4 ‘ - . ‘ R . 
(4) (2)* (1)" A ° 6 ‘ 12 12 ‘ “ 6 ° e 2 ° 
(4) (2) (1)* 2 : 20 12 40 56 8 24 28 24 ‘ 12 4 x 
(4) (2) (:)* 40 I 66 60 116 276 80 120 150 240 ° oO 60 6 
sc) 420 28 168 168 336 1008 336 560 840 | 1680 die 840 840 168 
(3)* (2) (1) 6 3 
(3)? Gy? . ° ‘ P 18 ‘ ; 9 . . . _ . ¢ 
(3)* (2)* ; , 6 - | 6 ji . 6 : . . : i 
(3)* (2)* (1)* ; , 6 2 18 s . 10 6 4 » m ‘ ‘ 
(3)* (2) (1)* I ; 18 12 54 48 | 8 36 18 24 2 P . : 
(3)* (1) 15 i 90 go | 162 360 | 120 126 90 180 30 . . . 
(3) (2) G1) . a 12 ; 18 12 | > ° 18 . . 4 | ‘ : 
(3) (2)® (1)* R 7 24 | 9 60 | 54 | 3 36 42 27 " 12 | 3 : 
(3) (2)* (1)* 10 ‘ 72 | 50 186 | 260 50 160 156 190 15 60 | 30 2 
(3) (2) (1)? 140 o.1 .e 231 588 1302 385 672 756 1281 175 420 | 315 42 
(3) G1) 1260 84 | 756 | 756 | 1890 5292 1764 3024 4158 8316 1336 3780 | 3780 756 
(2)* ; . ‘ . a 4 » : , : : | : F 
(z)* (1)* . ea 30 | 60 60 | ‘ ° 60 | ° 20 | . ° 
(2)* (1)* 4 a 84 6 | 216 264 | 24 144 216 | 144 ° 88 24 . 
(2)* (1)8 60 | I 258 | ibe | 708 1188 240 720 888 1080 | 90 456 | 240 18 
(2) (1) 560 28 840 728 | 2352 5264 1456 3136 4032 6496 | 840 2688 | 2016 280 
(2) (1)" 3360 210 2646 2520 7980 | 21420 | 6720 | 13440 | 18900 | 35280 | 5460 | 16380 | 15120 2772 
1)* 13860 | 924 8316 8316 | 27720 | 83160 — 55440 | 83160 ere | 27720 | 83160 | 83160 16632 















Table 1.12 (cont.) 




























































































[s21'] w =12 (iv) [s1*] | {"] 4°31] | [4%2*}] | [4*21") | [4"1‘] (43*2] | (43°1*) | [432*2] | [4322°] | [431°] [42°]. | [42"1"] 
— ‘va 
t 
720 | (12) — §040 2 -6 —6 24 —120 -6 24 24 —120 720 24 —120 
— 600 || (11) (1) 5240 . 2 ° —12 " —12 —-6 72 - ‘ 48 
— 360 (10) (2) 2520 . : 4 -8 36 2 a —12 42 —240 —24 78 
240) | (xo) (1)* —2520 ‘ ‘ 4 2 —36 . 2 : -18 240 ; —6 
— 240 (9) (3 1680 : 2 ° —4 16 4 —12 -10 46 - J 36 
300 § | (9) (2) () — 2520 . . . 4 —24 . : + —24 t . —36 
—60 |f | (9) (1)* 340 . 8 2 —60 
—180 | | (8) ts 1260 =% 5 5 —14 54 2 - 4 —10 54 — 300 --18 $4 
200 (8) (3) (x) — 1680 —1 2 - 2 —30 200 12 
90 | (8)(2 —630 —t I -3 2 -6 30 12 —24 
— 120 (8) (2) (x)? | 1260 : -1 6 ° 6 - 60 6 
10 (8) (1)* —210 . —1 ° 10 
~ 264 i} ts} 1728 2 -8 2 -8 -8 42 —264 36 
150 7 14) (3 | — 1260 —2 8 —48 8 3 —36 . —24 
120 (7) (3) (2) —840 : -2 2 -17 24 
—80 (7) (3) «1)* 840 -2 9 —80 > 
—75 |} 2 §2 630 ; -1 —15 12 
30 7) (2) Gy —420 . —1 10 
—: (7) (1) 42 ° —I ° 
—120 (6)? 840 2 —4 24 I -2 -4 18 —120 -8 22 
24¢ (6) (5) (1) | —1848 “ie . . 4 —48 2 2 —24 : —16 
9c (6) (4) (2) | —630 . : -4 6 —24 -1 I 6 —21 120 20 —41 
— 60 ‘8 ts} {n° 630 -2 24 -1 g —120 ; 5 
4¢ | 280 -1 2 2 ~ 40 - 
— 100 (6) G) (2)(1) | 840 —3 9 ~ 12 
20 (6) (3) (1)? | —280 . . -1 20 
—a§ (6) (2)* 105 . ° —4 s 
30 | 4 (6)(2)* (x)? | = 34S . ° . -3 
—s if (6) (2) (1)* 105 > ° . . 
(6) (1)* =—7 : ° ° | . 
84 )* (2) — §04 2 —12 . ° 2 -9 60 ° —1i2 
$0 (5)? G2)? 504 ° 12 : —60 : 
70 ) (3) —420 —2 4 —16 -2 8 S| —26 134 | 4 —18 
—90 tS) ‘4 A 630 -4 24 i -2 Is —120 8 
20 (s) (4) (1) 210 -8 =§ 4° (| 
—40 (5) yay) 280 . . ‘ —2 : 6 —40 ‘ . 
—35 | (5) (3) (2)* 210 . . ° 3 a : 2 -1 3 —15 5 6 
50 (5) (3) (2) (1)? —420 . : . . . : . ° -3 30 ‘ ‘. 
= (5) (3). (x)* 7° : . . . ° . . . ° -§ . 
15 | (s) (2)* (x) - 105 . . . ° ° . . . . . -2 
-10 (s) (2)? (1)* 105 : . . * : 
r (5) (2) (1)* —21 : : . ° . 
21 eshat)” Z . : . x ‘ - 
. ° Z -= -1I 2 -6 2 I —-6 30 -6 
(ay f3).(r) 1 r -2 8 —4 -1 1a —7o 6 
(4) I ° r —1 3 | . a7 3 —15 -6 9 
tf): £3) (a)® I 2 I I -6 . -3 30 -3 
(4)* Gr) I 4 3 ee : | —-5 . 
(4) (3)? (2) . . . - | r -1 —2 | 5 —20 6 
(4) (3)* G@)* 2 4 . ° I r | $3 20 ° 
| Gap (a) (aya) | ; 1 1 2 sh i <~s as i 
| (4) (3) (2) (x)* | 3 9 3 3 4 3 3 r -10 : i 
| (4)(3) G)* | 5 25 15 30 5 10 10 15 10 r . 
(4) (2)* 3 6 . é . i -1 
(4) (2)? (1)? 3 6 6 3 6 6 . I r 
(4) (2)* (1)* 7 28 16 18 I 20 12 28 8 3 6 
6 | (4) (2) (x)* . 15 > 60 105 15 ‘0 60 150 80 6 | 15 4s 
168 | a 8 35 280 210 420 70 abe 280 840 560 s6 | 105 420 
} (3)* (a) (2) 3 ; | 
| (3)? (1)? " 6 18 9 9 Z | : : 
| (3) (2)? . ° . 3 . | . ° 
| (3)? (2)? (1)* 2 4 2 : ‘ 11 I 4 | . ‘ 
? (3) (2) (1)* . 12 12 12 ; 39 jo 24 8 . 
3)* (x) . 30 180 go 180 30 135 135 180 | 120 12 ° 
(3) (2)* (x) . 3 3 6 : = 12 : 6 | ‘ I 
(3) (2)? (1)? 9 27 18 48 | 9 36 | 3 3 3 
2 (3) (2)* (1)* : 35 175 80 90 5 200 | 130 220 | 72 | r | 15 30 
42 8} 2) (1)’ I 105 Bs 420 735 105 840 | 735 1470 805 63 | 105 315 
756 3) (x)* 36 315 2835 1890 ° 630 3780 3780 9450 | 6300 630 | 945 3780 
(2)* ° 1s ° 45 ‘ . ‘ é wl . . 15 é 
RAD : 1s 30 45 15 ‘ 60 E do | . . | 1s 10 
{2)* (x) : 39 156 123 90 3 264 72 12 | 48 . 60 
18 | RH ng . 135 810 495 s85 45 | 1200 720 1800 | 600 18 195 420 
280 | 2)* (1)* 8 455 3640 2345 3780 | 490 | 5600 Ro uae 5600 3 2 1msS — 
2772 | 2) (1) a 120 1575 15750 11025 20475 | 3150 | 27309 200 69300 | 42000 787 28350 
16632 | (1) | 792 | $775 | 69300 | s1975 | 103950 | 17325 | 138000 | 138600 | 415 2¥7200 | 277200 St97S_ | 207900 
i 












Table 1.12 (cont.) 





























w= 12 (v) {42"1*] | [421°] [41°} [3*] [3°21] | [3°1*] | (3*2*] | [3*2*1*] | [3*21*) | [3*x*} (32*r] | [32*2*] | [32°14 
12) 720 — 5040 40320 —6 24 —120 24 --120 720 — 5040 — 120 720 — 5040 ! 
ii) fr} — 480 4320 — 40320 . —6 72 48 — 480 4320 24 —360 = | 
10) (2) —384 = — 20160 : —6 18 —18 54 — 264 1800 96 — 432 2640 
zor (1)? 144 — 1800 20160 . . —18 4 —6 144 — 1800 > 72 — 1200 } 
9) {3} —240 1680 — 13440 8 —20 76 -ie 72 — 384 2400 48 —348 2400 
(9) (2) (1) 264 —2160 20160 ‘ 2 -6 ‘ —24 168 — 1440 —24 234 — 1920 
(9) (1)* —24 480 —6720 z 2 é —24 480 % —6 240 7 
(8) (4) —300 1980 —15120 -6 54 -—6 38 —276 1980 42 —252 1860 
3 3) (1) 160 — 1440 13440 6 — 54 ‘ —32 280 —2160 —-6 180 — 1800 
8) (2)* 102 — 630 5040 ‘i . 6 —10 42 —270 —36 126 — 690 
8) (2) (1)? —84 goo — 10080 ° ° i ‘ 4 —48 540 ° — 54 660 
‘) f2), 2 —90 1680 a . js ‘* : 2 —9go ° ° —30 
7) (5) — 240 1728 — 13824 ‘a —6 36 —12 40 —240 1728 48 —252 1728 
7) (4) (1) 216 — 1800 15840 i a —36 2 —12 192 — 1800 —12 126 — 1380 
7) 33 2) 128 —840 6720 7 6 —18 12 —36 160 — 960 —48 228 — 1360 
7) (3) (1)? —48 600 — 6720 ‘ ‘ 18 ‘ 4 —96 960 . —36 640 
7) (2)* (1) -72 540 — 5040 4 ‘ “ B 4 —24 180 12 —72 510 | 
7) (2) (1)* 16 —240 3360 : ‘ ‘ ‘ é 8 —120 A 6 —140 / 
7) (1)* ’ 12 —336 ‘ = ‘ A s ‘ 12 . . 2 
6)* —120 840 —6720 3 —6 18 —2 22 —120 840 16 —120 840 
{8} (5) (1) 168 — 1584 14784 ; 3 —18 —20 168 —1584 --8 138 — 1320 
6) (4) (2) 180 —I110 8400 - 3 -9 3 —21 114 —810 —32 168 — 1080 
eS 4) (1)* —72 810 — 8400 ° : 9 : 1 —60 810 . —24 480 
6) (3)? ° — 280 2240 -6 9 —24 2 —24 112 — 640 -8 904 — 640 
(6) (3) {a} (1) —88 720 — 6720 x -3 9 a 20 —116 840 8 —132 1060 
(6) (3) ()* 8 — 160 2240 : -3 ; 20 —280 % 2 —140 | 
3 ete —18 105 —840 a -1 I —3 15 8 - 120 
6) (2)* (1)* 24 —225 2520 ° —1 6 —45 ° I — 180 
{6) (2) (x)* -2 45 —840 ; —1 15 ; 20? 
6) (1)* ; —1 56 ‘ —1 : 
(5)* (2) 72 — 504 4032 > 6 —10 8 —360 —2. — 528 
Ag {2}, —24 60 — 4032 3 . : 2 - 4 bm : _% pi 
5 {4} (3) 120 —804 8 6 —36 6 —28 172 —1128 —24 150 — 1064 
5) (4) (2) (1) —132 1044 — 9072 . . . 8 —— 720 12 —99 840 
(5) (4) (1)? 12 —240 3024 7 : ; a“ 8 —240 ; 3 — 100 
3} (3)* (1) —24 240 —224¢ -3 18 * 12 —88 624 4 -5 520 
5 23} fay —32 210 — 1680 “ : -6 & —30 180 24 -7 404 | 
85} 3) (2) (1)* 24 — 300 3360 . —4 36 — 360 . 36 — 400 
(s AEA F 30 — x60 : -2 60 . “ 20 
(5) (2)* (1) 12 —90 40 : : —~ 15 —90 
5) {2)* (1)* -4 60 —840 - * js ‘ ‘ é - 
5) i) (1)* e -6 168 A a ° ‘ . ° : b 
5) (1)? e 2 -8 js . 5 ‘ , : ; . ‘ 
4) 30 — 180 1260 ° —-6 ° —-2 24 — 180 -3 18 — 150 
(4)* (3) (1) —56 420 —3360 : 18 . 4 —72 600 3 —36 430 
(4)* (2)* —33 180 — 1260 - é ‘ 2 —12 90 6 —27 165 
(4)* (2) (1)* 30 —270 2520 . ° a ° 12 — 180 ° 9 — 150 
i 1)* ) 3 30 — 420 ‘ ° . * ° 30 ° : 5 
4) (3)* (2 —2 160 — 1120 -3 9 - 1 —58 ° 12 —6 0 
G {3s {is 12 —120 1120 : -9 : a 36 ~33e : 4 -~a 
4) (3) (2)* (2) 32 —210 1680 a . - — 180 _ _ 
4) (3 i} (i)* —8 100 —1120 ° ° : 4 120 : oe = ' 
4) (3) (2)* “ -6 112 : ‘ é . —12 ; : -1/} 
4 3 —15 105 ; ° j : . —1 - 
4 ye (Ye -6 45 —420 is ° ° ‘ . ‘ -3 > 
4) (2)* (1)* Z —1§ 210 ‘ ° » ° . ° " —§ 
4) (2) (1)* 15 Z —28 ° ° ° ° ° ° ° . 
4): 1)* 210 28 Z i : ‘ ° ‘ . : . 
(3 ° ° ° Z —1 2 2 -8 40 -6 40 
<3)" (2) (x) I r -3 ~ 4 20 -120 18 —140 
3)* (1)* ‘ I 3 r = ‘ ~ ; : 
3 2)* ° 4 < 2 I -1 : -%3 -4 Ir ~9 
3)* (2)* (1)* a 2 4 . 1 I -6 45 s -9 i 
3)" apcay e 4 16 4 3 6 z —1I5§ F ° -1}) 
3)* (1)* ° 10 60 20 15 45 15 Z ° . . 
(3) (2)* (1) . ‘ fi 4 ‘ ; ir I -3 15 
(3) (2)? G)? . 6 18 ‘ 10 9 i ‘ 3 Z —10 
3 2)* (1)* 5 . 20 100 20 40 70 10 . 15 10 1 
3) 2) (1)" 105 7 : ze 490 140 210 525 140 7 105 105 21 
(3) (1) 1890 252 9 280 2520 840 1260 3780 12 84 945 1260 378 
2)* ° ° 
2)* (1)? P ° ° : 20 ‘ % ° 4 i 
A 1) 6 ‘ 24 96 ° 88 72 . a. 16 : 
2)* (1 go 3 120 720 120 60 720 go 5 330 200 8 | 
2)* (1)* 1120 84 I a gee 1120 atoo 6160 1400 56 2520 2240 392 | 
(2 qo” 12600 1470 45 2800 28000 8400 18900 5 14700 840 22050 25200 6300 
(a 103950 13860 495 15400 | 184800 61600 | 138800 415800 | 138600 9240 | 207900 | 277200 83160 















































































w= 12 (vi) Table 1.12 (cont 
[3217] 31") -) 
12) a [2*1*] [atx 
(11) (1) 40320 = cinhs [2*1*] [2? 
(10) (2) —35280 om el 2*1*} far") 
ey A " ssete + seme 720 —_— [4] 
12 . ~ 
(9) (2) (2) a — 181440 144 —624 2880 a — 362880 
) ap 17640 161280 : 24 3600 £.’ 240 322560 3628800 
(8) (4) 4200 ~~ : —240 ee i 1760 — 3628800 — 39916800 
(8) (3) (2) — 15120 60480 : 240 asses | —soe | eteuhe —2177280 43545600 
(8) (a 16800 136080 : “7 20160 161280 181 23950080 
5040 — 166320 ° ~300 96 <a oe — 23950080 
(8) (2) (1)* — 45360 . & 1908 ~aan 40320 2217600 17740800 
(8) (x)* —7560 a 270 — 1200 mi 136080 — 604800 — 26611200 
(7) (s) 840 oo ee Bego | i | 2 sotbios 
7) (4) — 1382 —IStz0 : —30 —70560 1663200 = 
2)(3) 3 13860 124416 2 612 i 680400 199 
Oya — 5 ~240 = — 90720 oe 
(7) (2)? (1) —7560 — 86400 120 1728 ~ 138 —8400 — 1134000 
(7) (2) Gy —4410 86400 ° 240 — 1152 1886 124416 151200 14968800 
(7) (x)* 2100 45360 : 7 — 1632 a — 126720 — 1244160 —aaptiee 
(6)? —126 — 30240 - —120 258 —ae — 105600 1425600 _ 
~6720 3024 865 | —For0 estoo | 1936800 | — yeorees 
(6) (5) (x) ‘i . : i and 1800 66240 — 864000 11404800 
(6) (4)( 2) 12936 ‘a = — 20 : 4 — 26880 — 712800 11404800 
Say | “Ee | “Be | ss] oh | oot | ane “isgeee 
2 - ee: ~3 
(6) Gy) (2) 4760 75600 wag 280 = segs 1108 — 624800 $70240 
(6) G) se — 9240 — 40320 . —20 —I§i2 pha 8 —118272 6652800 
(6) (2)? 2240 90720 . 40 312 =a ped —92400 1330560 
(6) (2)* (1)* -% | “ashe o, a1 a od $8800 szosto | —rs9667a0 
) (2) (1) 1890 7560 . : — ~ae — 40320 — 756000 
6) (1)* —420 — 27680 ” -80 —32 ree oo800 403200 79200 
14 560 . 20 336 —2040 —20160 —— tokBoo — 4435200 
(s)? (2) — 504 : — 264 2790 15960 302400 _13305600 
(5)* (1) 4027 = 8 —300 — 31080 —1§1200 35200 
(s) (4) (3) — 2024 36288 . 5880 378000 1663200 
(5) (4) ( 316 36288 120 . — 168 — 100800 
sth {2} (1) —7938 ~J2578 —720 5040 1663200 
ey tf}: @) a 10 120 144 sas —44352 —s8 
5) (3) (2)* — 4648 — 27216 —120 — 900 7 282. 435456 
(5) (3) (2) (x)? —2814 44352 1008 _ 7992 <a —362 —47900%6 
(s) (3) a 4284 R : < robo _728760 ess 
5) (2)* (1) — - Bo64 - aaa ra —3600 ~e pene Bm 
(s) (2)* (2)? —7560 : — 288 Dn 37632 — 443520 —3991 
(5) (2) (1) —525 40 : ee - aes —36 5322240 
(5) (1)? 63 7560 —240 1964 4480 3991680 
4)° —1 1512 . : —15702 - — 7983360 
(4)* (3) (2) 1260 72 : : 48 — 166320 1330! 
fay (ay — 4200 ~~ —33 : . " 97 — 1995 
(4)? (2) (x)? — 1260 41580 15 30 ‘ 36 ~aae — 136080 
Ms (3h 1) 1890 11340 : -~30 — 162 a. a. 21168 199. 
(4) (3)? (2 —210 —22680 45 95 312 -3 — 11340 —720 —3991 
(4 Ge », — 3010 3780 z 1 333 ——— 3 113. I 
( 2 2310 _— “ — 162 po I —415 — 1247400 
4, — 25200 - 3 é - = 
‘) BOG 2730 * 450 — 3590 “3100 283500 1871100 
(4) (3) (1)* 1) — 1330 — 26460 —72 — 30800 — 37800 —3742200 
(4) (2¢ 84 17640 60 er — 19600 — 302400 623700 
{4} (2)? (y* 105 — 1704 : —480 40 252000 3326400 
4) (2)* (1)* —315 —945 — s 48 _ — 38640 — 3326400 
‘8 2) (1)* 105 3780 5 20 . 18 15680 415800 in 
4) (x)* -~7 — 1890 . —10 -- — 784 — 226800 4989600 
3)° : 252 -vs — 3045 17640 see 
(3) (2) (2) —280 —9 —6 r 5 be 28350 —332 
1120 2240 . ae —2310 —94500 311850 
(33) (ay: — 10060 . > 140 37800 _1247400 
3) (2)? — 280 a4 —a90 Ss 3780 623700 
(3)* (2)* (1)? 315 3360 —96 08 2240 90 83160 
ye 3 aS —735 —— — — 11200 —_ —2970 
3)* (x)* 175 7560 20 . ai 123200 246400 
Bra as | —tie :| <i | aed se ~ see 
3) (2)* (1)? — 105 168 72 —g00 — 5320 —33600 
b) 2)" (1)* 105 945 ‘ . 10360 50400 492800 
(a) ~21 he —10 : ™ a — 126000 ‘eee 
(3) (x) r 378 : 56 aa 6 33600 1663200 
- = —" 220 3360 ~1680 -s 
- z : —a —sbeo —34650 3 
aye 1) . 448 37800 415 
2)* (x)* : ; : —16 —8820 — 554400 
(2)* (1)* ‘ : . —! ~ 600 166320 
(2)* (x)* : . : rz 3 ai —10 Is 
8 (x) 16 . 6 —6 - 105 440 
» 480 . -.. 45 ws ~<a —420 — 945 
7920 4 : 420 s : 210 4725 10395 
220 : 945 4725 210 z em —3150 —62370 
sues 62370 3150 = r 630 S1975 
$1975 13880 45 —4s a 
Zz s 
1485 66 ga | 
£ 










































MISCELLANEA 


On the efficiency of the method of moments and Neyman’s type A distribution 
By L. R. SHENTON 


In a paper by Neyman (1939) several types of compound Poisson distributions are derived and their 
application to empirical data mentioned. Neyman remarks that the method of fitting requires investigation.
It is the object of this note to consider the efficiency of the method of moments and Neyman’s
type A distribution of two parameters. 

The efficiency of the method of moments, with particular reference to parameters of scale and location, 
has been discussed by R. A. Fisher (1921), in connexion with the Pearson system of frequency curves. 
More recently Fisher (1941) has used the covariance and information matrices to find an expression for 
the efficiency of the method of moments applied to the negative binomial distribution. The chief difficulty 
appears to be the evaluation of the information determinant, and the process given here may have 
applications in other cases. 

1. NEYMAN’S TYPE A DISTRIBUTION 
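The probability function is given by
\[
\exp\{-m_1[1-e^{m_2(t-1)}]\} = \sum_{x=0}^{\infty} P_x t^x, \tag{1}
\]
with
\[
P_x = \frac{e^{-m_1} m_2^x}{x!}\left\{0^x + \frac{\lambda 1^x}{1!} + \frac{\lambda^2 2^x}{2!} + \frac{\lambda^3 3^x}{3!} + \dots\right\}, \tag{2}
\]
where $\lambda = m_1 e^{-m_2}$, and $0^x = 1$ for $x = 0$, $0^x = 0$ for $x \neq 0$.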
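The series (2) can be summed directly for moderate values of x. The following is a minimal sketch, not part of the original note; the function name `neyman_type_a_pmf` and the truncation point `terms` are assumptions made for illustration.

```python
import math

def neyman_type_a_pmf(x, m1, m2, terms=100):
    """P_x from series (2), truncated after `terms` values of j (an assumed cut-off)."""
    lam = m1 * math.exp(-m2)             # lambda = m1 * e^{-m2}
    total = 1.0 if x == 0 else 0.0       # j = 0 term: 0^x is 1 only when x = 0
    weight = 1.0                         # running value of lambda^j / j!
    for j in range(1, terms):
        weight *= lam / j
        total += weight * float(j) ** x
    return math.exp(-m1) * m2 ** x / math.factorial(x) * total
```

For x = 0 the series reduces to exp{-m_1(1 - e^{-m_2})}, which gives a quick check on the implementation. For large x the recurrence quoted in Section 4 is likely to be the more convenient route.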
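The cumulants $K_r$ are given by
\[
\sum_{r=1}^{\infty} K_r t^r/r! = \sum_{s=1}^{\infty} m_1 m_2^s (e^t-1)^s/s!. \tag{3}
\]
The first few are recorded for later use:
\[
\begin{aligned}
K_1 &= m_1 m_2, \qquad K_2 = m_1 m_2(1+m_2),\\
K_3 &= m_1 m_2(1+3m_2+m_2^2),\\
K_4 &= m_1 m_2(1+7m_2+6m_2^2+m_2^3),\\
K_5 &= m_1 m_2(1+15m_2+25m_2^2+10m_2^3+m_2^4),\\
K_6 &= m_1 m_2(1+31m_2+90m_2^2+65m_2^3+15m_2^4+m_2^5).
\end{aligned} \tag{4}
\]
The relations
\[
K_{r+1} = m_2\left\{K_r + rK_{r-1} + \frac{r(r-1)}{2!}K_{r-2} + \dots + rK_1 + m_1\right\} \tag{5}
\]
and
\[
K_{r+1} = m_2\left\{K_r + \frac{\partial K_r}{\partial m_2}\right\}
\]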
are easily proved from (3), the second expression being useful if high-order cumulants are required. 
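As an illustration of the second relation, the sketch below builds the cumulants by repeated application of K_{r+1} = m_2{K_r + ∂K_r/∂m_2}, holding K_r/m_1 as a list of polynomial coefficients in m_2. The function name `cumulants` and the coefficient-list representation are assumptions of this sketch, not taken from the note.

```python
def cumulants(m1, m2, order):
    """Return [K_1, ..., K_order] using K_{r+1} = m2 * (K_r + dK_r/dm2)."""
    # K_r / m1 is a polynomial in m2; poly[p] is the coefficient of m2**p.
    poly = [0.0, 1.0]                                   # K_1 / m1 = m2
    values = []
    for _ in range(order):
        values.append(m1 * sum(c * m2 ** p for p, c in enumerate(poly)))
        deriv = [p * c for p, c in enumerate(poly)][1:]         # d/dm2
        summed = [a + b for a, b in zip(poly, deriv + [0.0])]   # K_r + dK_r/dm2
        poly = [0.0] + summed                                   # multiply by m2
    return values
```

With m_1 = m_2 = 1 the coefficients in (4) sum to the Bell numbers 1, 2, 5, 15, 52, 203, which provides a convenient check.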
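By the method of moments $m_1$ and $m_2$ are estimated from
\[
\lambda_1 = m_1 m_2, \qquad \lambda_2 = m_1 m_2(1+m_2),
\]
where $\lambda_1$ and $\lambda_2$ are the sample mean and the second sample moment about the mean. For large samples of $n$, we find
\[
\begin{aligned}
n \operatorname{var} \hat m_1 &= m_1\{2 + m_2^2 + 2m_1(1+m_2)^2\}/m_2^2,\\
n \operatorname{var} \hat m_2 &= \{2 + m_2 + 2m_1(1+m_2)^2\}/m_1,\\
n \operatorname{cov}(\hat m_1, \hat m_2) &= -2\{1 + m_1(1+m_2)^2\}/m_2,
\end{aligned}
\]
so that the covariance matrix
\[
\begin{bmatrix} \operatorname{var} \hat m_1 & \operatorname{cov}(\hat m_1, \hat m_2)\\ \operatorname{cov}(\hat m_1, \hat m_2) & \operatorname{var} \hat m_2 \end{bmatrix}
\]
has a determinant of value
\[
\{1 + (1+m_2)^2 + 2m_1(1+m_2)^3\}/n^2 m_2. \tag{6}
\]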


To find the efficiency of fitting the first two moments, we require the determinant of the information 
matrix. 
















2. LIKELIHOOD EQUATIONS AND INFORMATION MATRIX 
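By differentiating the G.F. (1) we find
\[
\frac{\partial P_x}{\partial m_2} = \{xP_x - (x+1)P_{x+1}\}/m_2 \quad\text{and}\quad \frac{\partial P_x}{\partial m_1} = -P_x + \frac{(x+1)}{m_1 m_2}P_{x+1}. \tag{7}
\]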








The likelihood of the sample $(n_0, n_1, n_2, \dots, n_r)$ is
\[
L = \prod_{x=0}^{r} (P_x)^{n_x},
\]
and so for optimum statistics $\hat m_1$, $\hat m_2$
\[
\Sigma' n_x(x+1)\frac{\hat P_{x+1}}{\hat P_x} = n\hat m_1 \hat m_2 = n\lambda_1,
\]
with $\Sigma'$ indicating summation over the sample. $m_1 m_2$ is therefore efficiently estimated by the mean, and
to complete the solution we must approximate to
\[
\Sigma' n_x(x+1)\frac{\hat P_{x+1}}{\hat P_x} = n\lambda_1. \tag{8}
\]


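If $m_1$ and $m_2$ are estimates by moments, they can be improved by (8). For $m_1 m_2 = \hat m_1 \hat m_2 = \lambda_1$, and writing
$P_x(m_2)$ to indicate $P_x$ with $m_1 = \lambda_1/m_2$, we have
\[
\frac{\partial}{\partial m_2}\Sigma' n_x(x+1)\frac{P_{x+1}(m_2)}{P_x(m_2)} = \frac{1}{m_2}\Sigma' n_x(x+1)\frac{P_{x+1}}{P_x} + \frac{1+m_2}{m_2^2}\Sigma' n_x\left\{(x+1)^2\frac{P_{x+1}^2}{P_x^2} - (x+1)(x+2)\frac{P_{x+2}}{P_x}\right\}. \tag{9}
\]
Thus if
\[
F(m_2) = \Sigma' n_x(x+1)\frac{P_{x+1}(m_2)}{P_x(m_2)} - n\lambda_1 \quad\text{and}\quad \hat m_2 = m_2 + dm_2,
\]
then
\[
dm_2 = -F(m_2)/F'(m_2),
\]
where (9) gives $F'(m_2)$. The improved solution is therefore obtained by the use of frequencies calculated
in the moment solution.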


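We now proceed to consider the efficiency. From equations (7) and (4) we have
\[
\begin{aligned}
I_{m_1 m_1} &= E\left(-\frac{\partial^2}{\partial m_1^2}\log P_x\right) = E\left\{\frac{1}{P_x^2}\left(\frac{\partial P_x}{\partial m_1}\right)^2\right\} = -1 + \phi/m_1^2 m_2^2,\\
I_{m_1 m_2} &= E\left(-\frac{\partial^2}{\partial m_1 \partial m_2}\log P_x\right) = E\left\{\frac{1}{P_x^2}\frac{\partial P_x}{\partial m_1}\frac{\partial P_x}{\partial m_2}\right\} = 1 + m_1 - \phi/m_1 m_2^2,\\
I_{m_2 m_2} &= E\left(-\frac{\partial^2}{\partial m_2^2}\log P_x\right) = E\left\{\frac{1}{P_x^2}\left(\frac{\partial P_x}{\partial m_2}\right)^2\right\} = -m_1(1+m_1) + \phi/m_2^2 + m_1/m_2,
\end{aligned}
\]
where $\phi = E\{(x+1)^2 P_{x+1}^2/P_x^2\}$, and for the information determinant
\[
\begin{vmatrix} nI_{m_1 m_1} & nI_{m_1 m_2}\\ nI_{m_1 m_2} & nI_{m_2 m_2} \end{vmatrix} = n^2\{(1+m_2)\phi - m_1 m_2^2(m_1 + m_1 m_2 + m_2)\}/m_1 m_2^3. \tag{10}
\]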
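The efficiency can also be evaluated by brute force, summing φ term by term. The sketch below is an illustration under stated assumptions rather than the method of the note: the probabilities come from the recurrence attributed to Beall in Section 4, the truncation point `size` is arbitrary, the information determinant is read as n²{(1+m_2)φ - m_1m_2²(m_1+m_1m_2+m_2)}/m_1m_2³, and the function names are hypothetical.

```python
import math

def pmf_table(m1, m2, size):
    """P_0 .. P_{size-1} from (x+1) P_{x+1} = m1 m2 e^{-m2} * sum_j m2^j/j! * P_{x-j}."""
    p = [math.exp(-m1 * (1.0 - math.exp(-m2)))]
    coef = m1 * m2 * math.exp(-m2)
    for x in range(size - 1):
        s, w = 0.0, 1.0                       # w holds m2^j / j!
        for j in range(x + 1):
            s += w * p[x - j]
            w *= m2 / (j + 1)
        p.append(coef * s / (x + 1))
    return p

def phi(m1, m2, size=80):
    """phi = sum over x of (x+1)^2 P_{x+1}^2 / P_x, truncated at `size` terms."""
    p = pmf_table(m1, m2, size)
    return sum((x + 1) ** 2 * p[x + 1] ** 2 / p[x]
               for x in range(size - 1) if p[x] > 0.0)

def efficiency(m1, m2):
    """1 / (covariance determinant (6) * information determinant (10)); n cancels."""
    f = phi(m1, m2)
    info = ((1 + m2) * f - m1 * m2 ** 2 * (m1 + m1 * m2 + m2)) / (m1 * m2 ** 3)
    cov = (1 + (1 + m2) ** 2 + 2 * m1 * (1 + m2) ** 3) / m2
    return 1.0 / (info * cov)
```

The numerator of the information determinant is a small difference of larger quantities, so φ needs to be summed well into the tail before the result is trusted.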


3. THE EVALUATION OF $\phi = E\{(x+1)^2 P_{x+1}^2/P_x^2\}$





” oh aioe ms A’ AQ = Y’33z 
Since 5 =e a 07 Th +--+ 
ms 
=em pe {Ag +A,x+ Aga(x— 1) + Ag2(x— 1) (x—2) +... }, 


where the A’s may be determined by the Gregory-Newton formula of interpolation, it is evident that we
may set up orthogonal polynomials with respect to $P_x$, defined by
\[
\sum_{x=0}^{\infty} \theta_r(x)\theta_s(x)P_x \begin{cases} = 0, & r \neq s,\\ \neq 0, & r = s, \end{cases}
\]










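with $\theta_0 = 1$, say. With these polynomials we may then find an expression for $(x+1)P_{x+1}$ in the form
\[
\{B_0 + B_1\theta_1(x) + B_2\theta_2(x) + \dots\}P_x, \tag{11}
\]
where the B’s are functions of $m_1$ and $m_2$. In fact we have
\[
\Sigma(x+1)P_{x+1}\theta_r(x) = B_r \Sigma\theta_r^2(x)P_x.
\]
Squaring (11) and summing, we have
\[
\phi = \sum_{x=0}^{\infty} (x+1)^2\frac{P_{x+1}^2}{P_x} = B_0^2\Sigma P_x + B_1^2\Sigma\theta_1^2(x)P_x + B_2^2\Sigma\theta_2^2(x)P_x + B_3^2\Sigma\theta_3^2(x)P_x + \dots. \tag{12}
\]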
= 


This is a series of positive terms, the first two of which amount to $m_1 m_2^2[m_1(1+m_2)+m_2]/(1+m_2)$, and
thus our expression (10) for the determinant of the information matrix is a series of positive terms. The
first term of significance turns out to be the fourth in (12). The value of (12) can therefore be found in
two stages: (a) the determination of the orthogonal polynomials, (b) the evaluation of the B’s.

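(a) The evaluation of the θ’s will be illustrated by finding $\theta_3(x)$. It appears most convenient to assume
\[
\theta_3(x) = (x-\lambda_1)^3 - \mu_3 + A[(x-\lambda_1)^2 - \mu_2] + B[x-\lambda_1],
\]
where $\lambda_1$ is $m_1 m_2$ and the μ’s refer to the moments of $P_x$ about the mean. The orthogonality conditions give
\[
\begin{aligned}
-\theta_3(x) + (x-\lambda_1)^3 - \mu_3 + A[(x-\lambda_1)^2 - \mu_2] + B(x-\lambda_1) &= 0,\\
\mu_4 + A\mu_3 + B\mu_2 &= 0,\\
\mu_5 - \mu_3\mu_2 + A(\mu_4 - \mu_2^2) + B\mu_3 &= 0,\\
-\Sigma\theta_3^2(x)P_x + \mu_6 - \mu_3^2 + A[\mu_5 - \mu_3\mu_2] + B\mu_4 &= 0.
\end{aligned} \tag{13}
\]
The moments in (13) may be replaced by cumulants, using well-known formulae
\[
\begin{gathered}
K_2 = \mu_2, \quad K_3 = \mu_3, \quad K_4 + 3K_2^2 = \mu_4,\\
K_5 + 10K_2K_3 = \mu_5, \quad K_6 + 15K_2K_4 + 10K_3^2 + 15K_2^3 = \mu_6,
\end{gathered}
\]
etc., and these in turn expressed in terms of $m_1$ and $m_2$ by means of (4).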
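(b) For $B_3$ we have
\[
B_3\sum_{0}^{\infty}\theta_3^2(x)P_x = \sum_{0}^{\infty}(x+1)P_{x+1}\theta_3(x),
\]
so that we require the value of $\Sigma(x-\lambda_1+\lambda_1)\theta_3(x-1)P_x$.
Expanding and simplifying, this becomes
\[
\mu_4 - 3\mu_3 + 3(1-\lambda_1)\mu_2 - \lambda_1 + A[\mu_3 - 2\mu_2 + \lambda_1] + B(\mu_2 - \lambda_1).
\]
Hence from (13) we have
\[
B_3^2\sum_{0}^{\infty}\theta_3^2(x)P_x =
\frac{\begin{vmatrix} -3\mu_3 + 3(1-\lambda_1)\mu_2 - \lambda_1 & -2\mu_2 + \lambda_1 & -\lambda_1\\ \mu_4 & \mu_3 & \mu_2\\ \mu_5 - \mu_3\mu_2 & \mu_4 - \mu_2^2 & \mu_3 \end{vmatrix}^2}
{\begin{vmatrix} \mu_2 & \mu_3\\ \mu_3 & \mu_4 - \mu_2^2 \end{vmatrix}\,
\begin{vmatrix} \mu_2 & \mu_3 & \mu_4\\ \mu_3 & \mu_4 - \mu_2^2 & \mu_5 - \mu_3\mu_2\\ \mu_4 & \mu_5 - \mu_3\mu_2 & \mu_6 - \mu_3^2 \end{vmatrix}}
= \frac{4m_1 m_2^5[1 + m_1(1+m_2)(4+m_2)]^2}{[2 + 2m_2 + m_2^2 + 2m_1(1+m_2)^3]\,\Delta},
\]
where
\[
\begin{aligned}
\Delta = {} & 12 + 12m_2 + 6m_2^2 + 2m_2^3 + m_1(48 + 144m_2 + 144m_2^2 + 80m_2^3 + 25m_2^4 + 2m_2^5)\\
& + 24m_1^2(1+m_2)^3(2 + 2m_2 + m_2^2) + 12m_1^3(1+m_2)^6.
\end{aligned} \tag{14}
\]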

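For the earlier polynomials it may be verified that
\[
\begin{aligned}
\theta_0(x) &= 1, \qquad \Sigma\theta_0^2(x)P_x = 1, \qquad B_0^2 = m_1^2 m_2^2,\\
\theta_1(x) &= x - m_1 m_2, \qquad B_1^2\Sigma\theta_1^2(x)P_x = \frac{m_1 m_2^3}{1+m_2},\\
\theta_2(x) &= (x - m_1 m_2)^2 - \frac{(1 + 3m_2 + m_2^2)}{(1+m_2)}(x - m_1 m_2) - m_1 m_2(1+m_2),\\
B_2^2\Sigma\theta_2^2(x)P_x &= \frac{m_1 m_2^4}{(1+m_2)[1 + (1+m_2)^2 + 2m_1(1+m_2)^3]}.
\end{aligned}
\]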











Inserting these values in (10) and multiplying by the covariance determinant (6), we have for the efficiency, $E$,
\[
\frac{1}{E} > 1 + \frac{4m_2(1+m_2)[1 + m_1(1+m_2)(4+m_2)]^2}{\Delta},
\]
where $\Delta$ is given by (14).

Table 1 gives an upper bound to the value of percentage efficiency, $E$, for various values of $m_1$ and $m_2$.

                               Table 1

                                  m_1
   m_2      0.1    0.5    1.0    2.0    3.0    4.0    6.0   10.0
   0.1       96     93     93     93     94     95     96     97
   0.2       92     88     87     88     89     91     93     95
   0.5       82     76     77     80     83     85     89     92
   1.0       73     67     69     76     80     83     87     91
   2.0       65     61     67     75     81     84     88     92
   3.0       62     60     67     77     82     85     89     93
   5.0       59     59     68     79     84     87     91     94

It is clear from Table 1 that if $m_2$ is small, say less than 0.2, then $E$ is generally in the region
of 90% or above. For $0.2 < m_2 < 1.0$, $E$ may be as low as 70%, but otherwise it lies between 75 and
90%, and for these values it is not easy to decide whether the ‘improved’ fit is worth the additional
computation involved. If $m_2 > 1.0$ and $m_1 < 3.0$, $E$ may be less than 70% and in this case the maximum
likelihood approximation (9) can be applied.
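The upper bound tabulated in Table 1 follows from keeping only the fourth term of (12). A minimal sketch is given below, assuming the bound takes the form 1/(1 + 4m_2(1+m_2)[1 + m_1(1+m_2)(4+m_2)]²/Δ) with Δ read from (14) as in the code comment; the function name is hypothetical.

```python
def efficiency_upper_bound(m1, m2):
    """Upper bound to the efficiency of the method of moments (as a fraction)."""
    # Delta as read from (14); the exponents are an assumption of this sketch.
    delta = (12 + 12 * m2 + 6 * m2 ** 2 + 2 * m2 ** 3
             + m1 * (48 + 144 * m2 + 144 * m2 ** 2 + 80 * m2 ** 3
                     + 25 * m2 ** 4 + 2 * m2 ** 5)
             + 24 * m1 ** 2 * (1 + m2) ** 3 * (2 + 2 * m2 + m2 ** 2)
             + 12 * m1 ** 3 * (1 + m2) ** 6)
    n = 1 + m1 * (1 + m2) * (4 + m2)
    return 1.0 / (1.0 + 4 * m2 * (1 + m2) * n * n / delta)
```

Under this reading, `efficiency_upper_bound(1.0, 1.0)` is about 0.695, in line with the entry 69 in Table 1, and the other tabulated entries checked here agree to the nearest unit.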


4. A NUMERICAL ILLUSTRATION 


That Neyman’s type A distribution of two parameters and the negative binomial are related, in that
they may arise as the result of heterogeneity in the population, has been pointed out by W. Feller (1943).
The similarity of the two may be appreciated by comparing the first three moments:

            Neyman’s type A                  Negative binomial
            parameters m_1 and m_2           parameters m_1 and m_2
   μ′_1     m_1 m_2                          m_1 m_2
   μ_2      m_1 m_2 (1 + m_2)                m_1 m_2 (1 + m_2)
   μ_3      m_1 m_2 (1 + 3m_2 + m_2^2)       m_1 m_2 (1 + 3m_2 + 2m_2^2)

It is thus reasonable to suppose that data satisfactorily described by the negative binomial may be
suitable for our present purpose provided the parameters satisfy the conditions $m_2 > 1.0$, $m_1 < 3.0$. Such
a distribution is given by Ove Lundberg (1940) concerning insurance claims and incapacity caused by
sickness or accident.

The moments of the distribution turn out to be
\[
\lambda_1 = 2.805871, \qquad \lambda_2 = 6.404549,
\]
from which, using the method of moments, we obtain the estimates
\[
m_1 = 2.187723, \qquad m_2 = 1.282553.
\]
Frequencies for these values are obtained by using the expression due to Beall (Neyman, 1939), namely,
\[
(x+1)P_{x+1} = m_1 m_2 e^{-m_2}\left\{P_x + \frac{m_2}{1!}P_{x-1} + \frac{m_2^2}{2!}P_{x-2} + \dots\right\},
\]
with $m_1 m_2 e^{-m_2} = 0.7781466$, $P_0 = e^{-a}$, where $a = m_1(1 - e^{-m_2})$,
so that $P_0 = 0.2057682$.
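Beall's recurrence lends itself to a short routine. The sketch below (function name hypothetical) starts from P_0 = exp{-m_1(1 - e^{-m_2})} and generates the expected frequencies for the moment estimates quoted above, with n = 1056 taken from the total of the observed frequencies in Table 2.

```python
import math

def neyman_frequencies(m1, m2, n, size):
    """Expected frequencies n*P_x, x = 0 .. size-1, by Beall's recurrence."""
    p = [math.exp(-m1 * (1.0 - math.exp(-m2)))]        # P_0 = e^{-a}
    coef = m1 * m2 * math.exp(-m2)
    for x in range(size - 1):
        s, w = 0.0, 1.0                                # w holds m2^j / j!
        for j in range(x + 1):
            s += w * p[x - j]
            w *= m2 / (j + 1)
        p.append(coef * s / (x + 1))
    return [n * px for px in p]

freq = neyman_frequencies(2.187723, 1.282553, 1056, 16)
```

The first three values of `freq` should agree with the entries 217.29, 169.08 and 174.22 given for the fit by moments in Table 2.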












































































                                     Table 2

     1          2             3               4              5              6
             Observed      nP_x by         nP_x by          χ²             χ²
     x       frequency     moments        likelihood     (moments)     (likelihood)
     0          187         217.29          197.15          4.22           0.52
     1          185         169.08          178.00          1.50           0.28
     2          200         174.22          181.27          3.81           1.94
     3          164         147.79          153.44          1.78           0.73
     4          107         114.14          117.67          0.45           0.97
     5           68          82.63           83.97          2.59           3.04
     6           49          56.70           56.49          1.05           0.99
     7           39          37.17           36.18          0.09           0.22
     8           21          23.44           22.21          0.25           0.07
     9           12          14.29           13.15          0.37           0.10
    10           11           8.45            7.54          0.77           1.59
    11            2           4.87            4.20          0.44           1.83
    12            5           2.74            2.28           -              -
    13            2           1.51            1.21           -              -
    14            3        [illegible]        0.63           -              -
    15            1           0.43            0.32           -              -
   ≥16            -           0.45            0.31           -              -
  11 to ≥16      13        [illegible]        8.95
   Total       1056                                        17.32          12.28
  log10 L                 -985.948937     -984.902555        -              -








The fit by moments is given in column 3 of Table 2. For an improved fit we set up a table of values consisting of (a) $(x+1)P_{x+1}/P_x$, (b) $(x+1)^2 P_{x+1}^2/P_x^2$, which is easily found from (a), (c) $(x+1)(x+2)P_{x+2}/P_x$, in each of which $x$ takes the values 0, 1, 2, ..., to 15. In the present case we have
$$\sum n_x (x+1) P_{x+1}/P_x = 2983.569210,$$
$$\sum n_x (x+1)^2 P_{x+1}^2/P_x^2 = 10550.201104,$$
$$\sum n_x (x+1)(x+2) P_{x+2}/P_x = 12126.916881.$$
Using these in (9) we find $\delta m_2 = -0.14863$, and the improved estimates $m_1 = 2.474481$, $m_2 = 1.133923$.

Since with these values $\sum n_x (x+1) P_{x+1}/P_x = 2965.031424$, it is clear we are much nearer a maximum likelihood solution for which
$$\sum n_x (x+1) P_{x+1}/P_x = n\lambda_1 = 2963.$$
The fit with the improved estimates is shown in column 4 of the table. The improvement is probably exaggerated by the value of $\chi^2$, where we have grouped the last five frequencies. A more reliable idea is given by $\log_{10} L$, shown at the bottom of the table, and it is evident that an improvement has been effected.
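As a check on the grouped $\chi^2$ for the likelihood fit, the calculation can be sketched as follows. The frequencies are those of columns 2 and 4 of Table 2; the helper name `grouped_chi_square` and the choice to pool every class from $x = 11$ onwards are assumptions read off the table layout.

```python
observed = [187, 185, 200, 164, 107, 68, 49, 39, 21, 12, 11, 2, 5, 2, 3, 1, 0]
likelihood_fit = [197.15, 178.00, 181.27, 153.44, 117.67, 83.97, 56.49, 36.18,
                  22.21, 13.15, 7.54, 4.20, 2.28, 1.21, 0.63, 0.32, 0.31]


def grouped_chi_square(obs, exp, first_grouped):
    """Pearson chi-square with all classes from `first_grouped` onwards pooled."""
    obs = obs[:first_grouped] + [sum(obs[first_grouped:])]
    exp = exp[:first_grouped] + [sum(exp[first_grouped:])]
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


chi2 = grouped_chi_square(observed, likelihood_fit, first_grouped=11)
print(round(chi2, 2))
```

Because the tabulated expectations are rounded to two decimals, the result agrees with the printed total 12.28 only to about the second decimal place.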


REFERENCES 


FELLER, W. (1943). On a general class of 'contagious' distributions. Ann. Math. Statist. 14, 389.

FISHER, R. A. (1921). On the mathematical foundations of theoretical statistics. Philos. Trans. A, 222, 309.

FISHER, R. A. (1941). The negative binomial distribution. Ann. Eugen., Lond., 11, 182.

LUNDBERG, O. (1940). On Random Processes and their Application to Sickness and Accident Statistics. Uppsala: Almquist and Wiksells.

NEYMAN, J. (1939). On a new class of 'contagious' distributions applicable in entomology and bacteriology. Ann. Math. Statist. 10, 35.













Large-sample theory of sequential estimation 
By F. J. ANSCOMBE 


Haldane (1945) and Finney (1949) have considered a sequential method of sampling a binomial population, called inverse sampling, in which sampling terminates when a specified number $k$ of individuals have been found possessing the attribute. If the proportion $\theta$ of individuals with the attribute in the population is small, $\theta$ is estimated with coefficient of variation depending only on the value of $k$. Thus the coefficient of variation of the estimate can be specified in advance. Tweedie (1945) has considered inverse sampling more generally.

Several recent papers in the Annals of Mathematical Statistics have dealt with the estimation of parameters in sequential sampling; in particular, Blackwell & Girshick (1947) have given a lower bound to the variance of an unbiased estimate based on a sufficient statistic. A general problem of sequential estimation suggests itself, namely, to formulate a rule of sampling such that an unknown population parameter can be estimated with specified accuracy and with minimum expected sample size, whatever the true value of the parameter. The accuracy of estimation might be specified by width of confidence interval, or by variance or coefficient of variation of the estimate, or in any other such way. It is the object of this note to point out that, while an exact small-sample solution of the problem is in general very difficult, the large-sample theory (valid for large expected sample size and high accuracy of estimation) is relatively easy.

Let us consider the estimation of the mean $\theta$ of a population of which the variance is a known finite function $v(\theta)$ of $\theta$. The observations will be denoted by $z_1, z_2, \dots$, and the cumulative sum of the first $m$ observations by $Z_m$. Sampling continues until a certain inequality is satisfied, of the form $Z_m > k(m)$ or $Z_m < k(m)$, where $k(m)$ is a function of $m$ determined in advance. The value of $m$ for which the inequality is first satisfied will be denoted by $n$. If we represent the sampling on a diagram in which $m$ is abscissa and the ordinate is denoted generally by $y$, $n$ is abscissa of the point where the sample path $y = Z_m$ first meets or crosses the boundary $y = k(m)$. We shall estimate $\theta$ by
$$\hat\theta = \frac{k(n)}{n}. \tag{1}$$
It will be shown that, under certain conditions, $\hat\theta$ is asymptotically normally distributed with mean $\theta$ and variance $v(\theta)/n_0$, where $n_0$ satisfies
$$k(n_0) = \theta n_0, \tag{2}$$
i.e. $n_0$ is abscissa of the point where the mean path $y = \theta m$ intersects the boundary. $\theta$ will therefore be estimated with specified variance $a^2$ if the equation of the boundary is
$$\frac{1}{m}\,v\!\left(\frac{y}{m}\right) = a^2, \tag{3}$$
and with specified coefficient of variation $b$ if the boundary is
$$\frac{m}{y^2}\,v\!\left(\frac{y}{m}\right) = b^2. \tag{4}$$
To establish this result (a sketch proof only will be given), we consider a sequence of sequential samplings performed on the same population (so that $\theta$ is constant) and defined by a sequence of boundaries $y = k(m)$ such that $n_0 \to \infty$. By (2) we have $|k(n_0)| \to \infty$ also, if $\theta \neq 0$. Let us suppose that the boundary system satisfies the following conditions:

(i) The boundary is approximately linear in the neighbourhood of $n_0$. More exactly, if $m$ is confined to an interval $(n_0 - c\sqrt{n_0},\; n_0 + c\sqrt{n_0})$, where $c$ is constant, then as $n_0 \to \infty$
$$k(m) = k(n_0) + (m - n_0)\,k'(n_0) + O(1). \tag{5}$$
(ii) We can ignore the possibility that the boundary will be crossed elsewhere than in the neighbourhood of $n_0$, i.e. by choosing $c$ sufficiently large we can make the chance arbitrarily small that $|n - n_0| > c\sqrt{n_0}$ if $n_0$ is large.










(iii) The boundary crosses the mean path $y = \theta m$ at a non-zero angle. More exactly, $k'(n_0)$ tends to a limit not equal to $\theta$ as $n_0 \to \infty$.

It has been assumed here that the limit of $k'(n_0)$ is finite. If it is infinite, condition (i) must be reformulated: if $k(m)$ is confined to an interval $(k(n_0) - c\sqrt{k(n_0)},\; k(n_0) + c\sqrt{k(n_0)})$, where $c$ is constant, then as $n_0 \to \infty$
$$m = n_0 + O(1). \tag{6}$$
Conditions (i) and (iii) are in fact satisfied by the boundaries defined by equations (3) and (4). For the first of these,
$$n_0 = \frac{v(\theta)}{a^2}, \tag{7}$$
$$k'(n_0) = \theta + \frac{v(\theta)}{v'(\theta)}, \tag{8}$$
and if we consider a sequence of boundaries for which $a \to 0$, we have $n_0 \to \infty$, $k'(n_0)$ is a constant (finite or infinite) not equal to $\theta$ if $v(\theta) > 0$, and (provided $k'(n_0)$ is finite) $k''(n_0) = O(n_0^{-1})$. Similarly for the second, when $\theta \neq 0$,
$$n_0 = \frac{v(\theta)}{b^2\theta^2}, \tag{9}$$
$$k'(n_0) = \theta\,\frac{\theta v'(\theta) - v(\theta)}{\theta v'(\theta) - 2v(\theta)}, \tag{10}$$
and the same conclusions follow. Condition (ii), however, is not always satisfied by these boundaries, and they may need modification, as described below.
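The slopes quoted in equations (8) and (10) are stated without derivation. A sketch of the intermediate step, obtained by differentiating the two boundary equations implicitly (writing $k = k(m)$ and $k' = dk/dm$, and assuming $v$ is differentiable), is as follows.

For boundary (3), $v(k/m) = a^2 m$, so that
$$v'\!\left(\frac{k}{m}\right)\left(\frac{k'}{m} - \frac{k}{m^2}\right) = a^2 .$$
At $m = n_0$ we have $k/m = \theta$ and $a^2 n_0 = v(\theta)$, whence
$$k'(n_0) = \theta + \frac{a^2 n_0}{v'(\theta)} = \theta + \frac{v(\theta)}{v'(\theta)} .$$
For boundary (4), $m\,v(k/m) = b^2 k^2$, so that
$$v\!\left(\frac{k}{m}\right) + v'\!\left(\frac{k}{m}\right)\left(k' - \frac{k}{m}\right) = 2b^2 k\,k' .$$
At $m = n_0$ we have $k/m = \theta$ and $b^2 k = b^2\theta n_0 = v(\theta)/\theta$, whence
$$k'(n_0)\,\{\theta v'(\theta) - 2v(\theta)\} = \theta\,\{\theta v'(\theta) - v(\theta)\} .$$
As a check, the binomial case $v(\theta) = \theta(1-\theta)$ gives $k'(n_0) = \theta^2$ for boundary (4).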

Assuming that all three conditions are satisfied, we consider first the case $\lim k'(n_0)$ finite, and show that if $m$ has any value in the range $|m - n_0| < c\sqrt{n_0}$,
$$\operatorname{prob}(n < m) = \operatorname{prob}(Z_m > k(m)) + o(1) \tag{11}$$
as $n_0 \to \infty$ if $k'(n_0) < \theta$, while the inequality on the right-hand side is reversed if $k'(n_0) > \theta$. Here $Z_m$ is the cumulative sum of $m$ observations, taken without regard to whether the boundary has been crossed or not. Let us suppose that $k'(n_0) < \theta$. Then the left-hand side of (11) exceeds the first member on the right-hand side by the chance that a path reaches the boundary for some $n < m$ and then recrosses it to give $Z_m < k(m)$. This requires that $\sum_{i=n+1}^{m} z_i$ shall be less than $(m - n)\,k'(n_0)$, while its mean is $(m - n)\theta$ and its variance is $(m - n)v(\theta)$. Now as $n_0 \to \infty$ the variance of $n$ is of order $n_0$, i.e. tends to a finite non-zero multiple of $n_0$. Hence, applying the central limit theorem, and integrating over values of $n < m$, we see that the chance in question is of order $\Phi(-m^{1/4})$, where $\Phi$ denotes the normal integral, and so is $o(1)$. Similarly if $k'(n_0) > \theta$. It may be noted in passing that if $z_i > 0$ always and if $k'(m) < 0$ for all $m$, the boundary once reached cannot be recrossed, and equation (11), without the $o(1)$ on the right-hand side, gives the exact distribution of $n$ in terms of the distribution of $Z_m$.

Now the distribution of $Z_m$ is asymptotically normal with mean $\theta m$ and variance $v(\theta)m$. Let us set $m = n_0 + u$, where $|u| < c\sqrt{n_0}$. Then asymptotically
$$\operatorname{prob}(Z_m > k(m)) = \operatorname{prob}(Z_m > k(n_0) + u\,k'(n_0)) = \Phi\!\left(\frac{(\theta - k'(n_0))\,u}{[(n_0 + u)\,v(\theta)]^{1/2}}\right). \tag{12}$$
This takes values other than 0 and 1 only when $u = O(\sqrt{n_0})$, and so we may neglect the $u$ in the denominator. Thus $n$ is asymptotically normally distributed with mean $n_0$ and variance $n_0 v(\theta)\,[\theta - k'(n_0)]^{-2}$. But from (1) we have asymptotically
$$\hat\theta = \frac{k(n_0) + u\,k'(n_0)}{n_0 + u} = \theta - \frac{u}{n_0}\,(\theta - k'(n_0)). \tag{13}$$
Hence we have the desired result that $\hat\theta$ is asymptotically normally distributed with mean $\theta$ and variance $v(\theta)/n_0$.

If the boundary is wholly vertical, the result is still true, since the sample size $n$ is now fixed. It can also be established if $\lim k'(n_0) = \infty$ for some value of $\theta$, using equation (6).

We can replace $n_0$ by $E(n)$ in the variance. The asymptotic variance is then equal to Blackwell & Girshick's lower bound to the variance of an unbiased estimate of $\theta$. We may also replace $k(n)$ by $Z_n$ in the numerator of (1).







Let us now consider some examples of boundaries. First suppose $v(\theta)$ is a known constant. Then equation (3) represents a vertical boundary, i.e. fixed sample size, with
$$m = v/a^2. \tag{14}$$
Stein & Wald (1947) have demonstrated the optimum character of this boundary. Equation (4) gives a parabolic boundary,
$$y^2 = (v/b^2)\,m. \tag{15}$$
This boundary, as it stands, passes through the origin. The above theory will apply if we can ignore the possibility that the boundary will be reached near the origin instead of near $m = n_0$. The boundary should therefore be modified for $m$ small so that it does not approach very close to the origin. If $|\theta|$ is not too large $\theta$ will be estimated with the specified coefficient of variation $b$, but it will be estimated with greater accuracy than required if $|\theta|$ is so large that the sample path is likely to meet the modified part of the boundary.

Next, let us consider the Poisson distribution, for which $v(\theta) = \theta$. Equation (3) gives a parabolic boundary,
$$y = a^2 m^2, \tag{16}$$
which again needs modification for $m$ small (so that if $\theta$ is near 0 it will be estimated with lower variance than $a^2$). Equation (4) gives a horizontal boundary,
$$y = 1/b^2. \tag{17}$$
For the type III distribution of a variance estimate of $\nu$ degrees of freedom from a normal population with variance $\theta$, we have $v(\theta) = (2/\nu)\theta^2$. Equation (3) gives the curve
$$y^2 = \tfrac{1}{2}\nu a^2 m^3, \tag{18}$$
which needs modification for $m$ small. Equation (4) gives a vertical boundary with fixed sample size
$$m = 2/(\nu b^2). \tag{19}$$
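The closed forms (16) to (19) can be verified by substituting them back into the left-hand sides of equations (3) and (4). The sketch below does this numerically; the function names and the particular values of $a$, $b$, $\nu$ and $m$ are illustrative assumptions only.

```python
import math


def variance_on_boundary(m, y, v):
    """Left-hand side of equation (3): v(y/m) / m."""
    return v(y / m) / m


def cv_squared_on_boundary(m, y, v):
    """Left-hand side of equation (4): m * v(y/m) / y**2."""
    return m * v(y / m) / y ** 2


def poisson_variance(theta):
    return theta


NU = 10  # hypothetical degrees of freedom for the type III example


def type_iii_variance(theta):
    return (2.0 / NU) * theta ** 2


a, b, m = 0.2, 0.1, 400.0

# Poisson: (16) y = a^2 m^2 and (17) y = 1/b^2.
check_16 = variance_on_boundary(m, a ** 2 * m ** 2, poisson_variance)
check_17 = cv_squared_on_boundary(m, 1.0 / b ** 2, poisson_variance)

# Type III: (18) y^2 = nu a^2 m^3 / 2 and (19) m = 2 / (nu b^2), any y.
check_18 = variance_on_boundary(m, math.sqrt(0.5 * NU * a ** 2 * m ** 3),
                                type_iii_variance)
check_19 = cv_squared_on_boundary(2.0 / (NU * b ** 2), 7.0, type_iii_variance)
```

Each check should return $a^2$ or $b^2$ as appropriate, up to floating-point rounding.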


Finally, let us consider sampling a binomial population in which a proportion $\theta$ of individuals possess a certain attribute. Then $v(\theta) = \theta(1 - \theta)$. To obtain a specified coefficient of variation $b$ of $\hat\theta$, we use the boundary given by equation (4), namely,
$$y = \frac{1}{b^2 + 1/m}, \tag{20}$$
but modified near $m = 0$. For $\theta$ small, so that $n_0$ is large, this is practically the horizontal boundary of 'inverse sampling', which, as already remarked, gives nearly constant coefficient of variation if $\theta$ is small.

Suppose now that we wish to estimate $\theta$ with coefficient of variation $b$ if $\theta$ is small, and to estimate $1 - \theta$ with the same coefficient of variation if $1 - \theta$ is small. We shall meet this requirement if $\ln\{\theta/(1 - \theta)\}$ is estimated with constant variance $b^2$ (assuming, as usual, that the variance is small). Now asymptotically
$$\operatorname{var}[\ln\{\hat\theta/(1 - \hat\theta)\}] = [\theta(1 - \theta)]^{-2}\operatorname{var}(\hat\theta) = [\theta(1 - \theta)\,n_0]^{-1}, \tag{21}$$
and this will be equal to $b^2$ if the boundary satisfies
$$b^2 m\,\frac{y}{m}\left(1 - \frac{y}{m}\right) = 1,$$
or
$$x + y = b^2 xy, \tag{22}$$
where $x = m - y$. If, therefore, we plot the number of individuals found possessing the attribute as ordinate against the number without as abscissa, sampling proceeds until the rectangular hyperbola (22) is reached. For $\theta$ near 0 this is nearly the horizontal straight line $y = b^{-2}$, and for $\theta$ near 1 it is nearly the vertical line $x = b^{-2}$.
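The stopping rule defined by the hyperbola (22) might be applied to a recorded sequence of observations along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, observations are coded 1 (attribute present) or 0, the boundary is taken as reached when $x + y \leq b^2 xy$, and the estimate returned is $Z_n/n$, which the text allows in place of $k(n)/n$.

```python
def sample_until_hyperbola(observations, b):
    """Stop at the first m for which x + y <= b^2 * x * y.

    y counts individuals with the attribute, x those without.
    Returns (n, y / n) at the stopping point, or None if the
    boundary is never reached within the supplied sequence.
    """
    x = y = 0
    for obs in observations:
        if obs:
            y += 1
        else:
            x += 1
        if x + y <= b * b * x * y:
            return x + y, y / (x + y)
    return None


# A deterministic illustration: strictly alternating outcomes, b = 0.5.
result = sample_until_hyperbola([1, 0] * 20, b=0.5)
print(result)  # (16, 0.5)
```

With $b = 0.5$ the hyperbola is $x + y = xy/4$, first met by the alternating path at $x = y = 8$. A sequence containing only one kind of outcome never reaches the boundary, which reflects the remark that the rule gives the lines $y = b^{-2}$ and $x = b^{-2}$ only as limits.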

To sum up, we have considered the estimation of a single unknown parameter. We confine attention to statistics which are the sum $Z_m$ of the observations $z_i$ in some scale of measurement. If we set $E(z_i) = \theta$, then $\operatorname{var}(z_i)$ can be expressed as a function $v(\theta)$ of $\theta$. We find that the variance of $Z_n/n$ is, for $n$ large, equal to $v(\theta)/n$, whether $n$ is fixed in advance or determined by a single-boundary sequential procedure satisfying conditions (i) to (iii). If we choose a scale of measurement of the observations such that, in samples of fixed size, the unknown parameter is estimated with minimum variance, then the use of the appropriate sequential procedure will secure the desired accuracy of estimation with minimum average sample size.










If there is more than one unknown parameter, and the variance $v$ of the observations $z_i$ is finite but not a known function of the mean $\theta$, $v$ must be estimated as sampling proceeds, and a boundary cannot be specified in advance. Let us suppose that $v$ can be estimated from a sample of $m$ observations by an estimate $\hat v$ that is asymptotically normal with variance $O(m^{-1})$ as $m \to \infty$. When, say, half the sampling has been done that will ultimately be seen to be needed, $\hat v$ satisfies in probability the relation
$$\hat v = v\,(1 + O(n^{-1/2})), \tag{23}$$
and so the eventual sample size $n$ and estimate $\hat\theta$ of $\theta$ satisfy
$$\operatorname{var}(\hat\theta) \sim \hat v/n. \tag{24}$$
Thus the required value of $n$ can be predicted at this half-way stage with asymptotic validity.


I am indebted to Mr D. V. Lindley for helpful criticism. The treatment given above is somewhat heuristic in places, notably in the argument used in arriving at equation (11). It seems to me, however, that the line of approach may prove to be worth pursuing.


REFERENCES 


BLACKWELL, D. & GIRSHICK, M. A. (1947). A lower bound for the variance of some unbiased sequential estimates. Ann. Math. Statist. 18, 277.

FINNEY, D. J. (1949). On a method of estimating frequencies. Biometrika, 36, 233.

HALDANE, J. B. S. (1945). On a method of estimating frequencies. Biometrika, 33, 222.

STEIN, C. & WALD, A. (1947). Sequential confidence intervals for the mean of a normal distribution with known variance. Ann. Math. Statist. 18, 427.

TWEEDIE, M. C. K. (1945). Letter in Nature, Lond., 155, 453.


A historical note on the method of least squares 


By R. L. PLACKETT, University of Liverpool 


1. The purposes of this note are: 

(i) to summarize the justifications by Laplace, Gauss and Markoff of the method of least squares; 
(ii) to suggest that Gauss was the first who justified least squares as giving those linear estimates 
which are unbiased of minimum variance; 

(iii) to modernize and extend his proof to cover a general theorem due to Aitken. 

It is not my object to provoke controversy, and I have attempted to indicate where a personal opinion 
is intended. 

2. The method of least squares has been in use now for over 150 years. During the nineteenth century 
the writings of Todhunter (1865), Merriman (1877) and others gave the impression that Laplace (1812-20, 
collected works 1886) was largely responsible for putting the method on a theoretical basis by means of 
the calculus of probability, whereas the contribution of Gauss (1821, collected works 1873) was mini- 
mized or ignored. Lately, the emphasis has changed, and in recent papers and text-books Markoff (1912) 
is credited with justifying the method without superfluous assumptions of normality. For these reasons, 
it seems desirable to disentangle the various justifications proposed and to allot credit in due proportion.

3. In general let $\theta$ ($s \times 1$) be a vector of unknown parameters, $x$ ($n \times 1$) a vector of observations, $e$ ($n \times 1$) a vector of errors and $A$ ($n \times s$) a matrix of known quantities; so that
$$A\theta - x = e.$$
Further, suppose that $W$ ($n \times n$) is a diagonal matrix whose elements are the reciprocals of the error variances. It is required to form an estimate $\theta^*$ of $\theta$. The method of least squares leads to estimates which satisfy
$$A'WA\theta^* = A'Wx.$$
Neither Laplace nor Gauss used matrix notation, but their results can immediately be written in that form.







4. Laplace (1812-20) discusses the method of least squares in Book 2, Chapter 4, and in the first three Supplements. He proves a series of results which are summarized (I hope fairly) in the following

THEOREM. Among all $s \times n$ matrices $F$ leading to estimates of the form $FA\theta^* = Fx$, the expected values of the elements of $|\theta^* - \theta|$ are minimized as $n \to \infty$ when $F = \mu A'W$, $\mu$ being an arbitrary multiplier.

The proof is long but runs on these lines: if $u$ is the error of $\theta^*$ then $FAu = Fe$; Laplace proceeds to determine the joint characteristic function of $Fe$ and deduces that when all errors have the same distribution, symmetrical about zero, $Fe$ has a multivariate normal distribution as $n \to \infty$; whence $u$ also has a multivariate normal distribution, and the expected values of the elements of $|u|$ are determined; finally, he shows that $F = \mu A'W$ implies the vanishing of the differential coefficients of these expected values with respect to the elements of $F$.

In more detail, Laplace first takes $s = 1$ and maximizes the probability that his estimate lies between given limits; he then notes that this is the same as minimizing $E\,|\theta^* - \theta|$, and continues to use this criterion when $s = 2$, stating that the result can be extended to greater values of $s$. In the first Supplement he considers the possibility of a bias in the observations and suggests its removal by introducing an additional parameter whose coefficient is unity in all equations.


5. Gauss presented his justification in 1821. The paper is written in Latin, but a French translation was published by Bertrand in 1855 and the fundamental theorem incorporated in Bertrand's own book of 1888. In the early sections of his paper, Gauss also considers the possibility of bias in his observations and makes it clear that the preferred estimates are those with minimum variance, although of course he does not use this terminology. He begins with errors of differing variance, and by choosing suitable multipliers presents the equations in a form where the errors have the same variance. The proof of the following theorem is in Art. 20; it is implicit that he is seeking unbiased estimates:

THEOREM. Among all the systems of coefficients $B$ ($s \times n$) which give $Be = \theta - \hat\theta$, the estimate $\hat\theta$ being independent of $\theta$, those for which the diagonal elements of $BB'$ are minimized are provided by the method of least squares.

Put $\xi = A'e$ so that $\xi = A'A\theta - A'x$.

The solution of these equations is $\theta = \theta^* + D\xi$, and with $DA' = E$ this becomes
$$Ee = \theta - \theta^* \quad\text{so}\quad (\theta^* - \hat\theta) = (B - E)\,e.$$
If this is true for all $\theta$ then $(B - E)A = 0$, i.e. on post-multiplying by $D'$, $(B - E)E' = 0$, which implies $BB' = EE' + (B - E)(B - E)'$. It is now clear that the diagonal elements of $BB'$ are minimized when $B = E$.


6. Matrix notation has been adopted for brevity in the preceding sections, but no matrix theorems have been assumed. Taking for granted the now familiar properties of matrices regarding associative products and inverses, the preceding demonstration can be modernized and shortened.

If $\theta^* = Bx$ is unbiased for all $\theta$, then $BA = I$. With $C = A'A$ it follows that $C^{-1} = BAC^{-1}$, so
$$BB' = (C^{-1}A')(C^{-1}A')' + (B - C^{-1}A')(B - C^{-1}A')',$$
i.e. the diagonal elements of $BB'$ are least when $B = C^{-1}A'$, which is the solution provided by least squares.
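The decomposition of $BB'$ can be illustrated on a small numerical case. The sketch below is not from the original note: the $3 \times 2$ matrix `A`, the perturbation `M` (chosen so that $MA = 0$) and the helper names are hypothetical, and exact rational arithmetic is used so that the comparison is free of rounding.

```python
from fractions import Fraction as F


def matmul(p, q):
    """Product of two matrices held as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)]
            for row in p]


def transpose(p):
    return [list(col) for col in zip(*p)]


def inverse_2x2(c):
    (a, b), (d, e) = c
    det = a * e - b * d
    return [[e / det, -b / det], [-d / det, a / det]]


def diag_of_gram(b):
    """Diagonal elements of B B' (proportional to the estimate variances)."""
    return [sum(v * v for v in row) for row in b]


A = [[F(1), F(0)], [F(1), F(1)], [F(1), F(2)]]   # hypothetical design, n = 3, s = 2
C = matmul(transpose(A), A)
B_ls = matmul(inverse_2x2(C), transpose(A))       # least squares: C^{-1} A'

# Another unbiased system of coefficients: B = B_ls + M with M A = 0.
M = [[F(1), F(-2), F(1)], [F(0), F(0), F(0)]]
B = [[u + v for u, v in zip(r1, r2)] for r1, r2 in zip(B_ls, M)]

print(diag_of_gram(B_ls), diag_of_gram(B))
```

Both `B_ls` and `B` satisfy $BA = I$, but the first diagonal element of $BB'$ rises from 5/6 to 41/6, the increase being the corresponding diagonal element of $MM'$.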


7. Markoff (1912) devotes Chapter 7 of his book to the method of least squares. He states that each 
observation is to be considered as a particular case of many, and as an unbiased estimate of some linear 
function of the unknown parameters. His determination of unbiased estimates of these parameters having 
minimum variance is closely followed in the paper by David & Neyman (1938). 


8. Aitken (1934) has extended the theorem of Gauss by proving that with a known matrix $V$ of variances and covariances of the observations, the minimum of $(A\theta - x)'\,V^{-1}(A\theta - x)$ provides estimates $\theta^*$ such that $\phi^* = P\theta^*$ is an unbiased estimate of $\phi = P\theta$ with minimum variance. Gauss's method can be used to prove this also.
If $\phi^* = Bx$ then $BA = P$ and
$$BV\,[P(A'V^{-1}A)^{-1}A'V^{-1}]' = [P(A'V^{-1}A)^{-1}A'V^{-1}]\,V\,[P(A'V^{-1}A)^{-1}A'V^{-1}]',$$
consequently
$$BVB' = [P(A'V^{-1}A)^{-1}A'V^{-1}]\,V\,[P(A'V^{-1}A)^{-1}A'V^{-1}]' + [B - P(A'V^{-1}A)^{-1}A'V^{-1}]\,V\,[B - P(A'V^{-1}A)^{-1}A'V^{-1}]'.$$










If we consider the diagonal elements here, the second term on the right gives a positive definite quadratic form, so minimum variance is attained when
$$B = P(A'V^{-1}A)^{-1}A'V^{-1},$$
the solution given by the method indicated above.

9. It is therefore my opinion that Laplace and Gauss proved theorems which are quite different; that the justification given by Gauss is preferable; and that Markoff, who refers to Gauss's work, may perhaps have clarified assumptions implicit there but proved nothing new. It is evident that Gauss's proof is valid for all values of $n$, entirely free from any assumption of normality, and capable of immediate development.


REFERENCES 
AITKEN, A. C. (1934). On least squares and linear combination of observations. Proc. Roy. Soc. Edinb. A, 55, 42-7.

BERTRAND, J. (1888). Calcul des probabilités. Paris.

DAVID, F. N. & NEYMAN, J. (1938). Extension of the Markoff theorem on least squares. Statist. Res. Mem. 2, 105-16.

GAUSS, C. F. (1855). Méthode des moindres carrés (trans. J. Bertrand). Paris.

GAUSS, C. F. (1873). Theoria combinationis observationum erroribus minimis obnoxiae. Pars prior. Werke, Band 4. Göttingen.

LAPLACE, P. S., Marquis de (1886). Théorie analytique des probabilités, 3rd edition, Oeuvres, 7. Paris.

MARKOFF, A. A. (1912). Wahrscheinlichkeitsrechnung (trans. H. Liebmann), 2nd edition. Leipzig and Berlin.

MERRIMAN, M. (1877). A list of writings relating to the method of least squares, with historical and critical notes. Trans. Conn. Acad. Arts Sci. 4, 151-232.

TODHUNTER, I. (1865). A History of the Mathematical Theory of Probability. Macmillan.


The characteristic function of a weighted sum of non-central squares of
normal variates subject to s linear restraints


By G. I. BATEMAN 


1. Suppose $x_1, x_2, \dots, x_n$ are independent normal variables with expectations $a_1, a_2, \dots, a_n$, respectively and with unit variance, that is to say, we suppose that
$$p(x_j) = \frac{1}{\sqrt{2\pi}}\exp\left[-\tfrac{1}{2}(x_j - a_j)^2\right] \qquad (j = 1, 2, \dots, n).$$
We consider a weighted non-central sum of squares of the type
$$\chi'^2 = \sum_{j=1}^{n} c_j x_j^2,$$
where the $x$'s are as defined and the $c_j$ ($j = 1, 2, \dots, n$) are constants. It is assumed that the $x_j$ are subject to $s$ linear restraints
$$\sum_{j=1}^{n} b_{lj}x_j = p_l \qquad (l = 1, 2, \dots, s),$$
$b_{lj}$ and $p_l$ ($l = 1, \dots, s$; $j = 1, 2, \dots, n$) being given constants. The characteristic function of the joint distribution of $\chi'^2, p_1, p_2, \dots, p_s$ may be written down immediately. We have
$$\phi(t, t_1, \dots, t_s) = \prod_{j=1}^{n}\left\{\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}\exp\Big[-\tfrac{1}{2}(x_j - a_j)^2 + itc_jx_j^2 + i\sum_{l=1}^{s}t_l b_{lj}x_j\Big]\,dx_j\right\}.$$
The evaluation of the integrals is straightforward and we obtain
$$\phi(t, t_1, \dots, t_s) = \prod_{j=1}^{n}(1 - 2itc_j)^{-1/2}\exp\Big[it\sum_{j=1}^{n}\frac{c_j a_j^2}{1 - 2itc_j} + i\sum_{l=1}^{s}B_l t_l - \tfrac{1}{2}\sum_{l=1}^{s}\sum_{m=1}^{s}A_{lm}t_l t_m\Big], \tag{1}$$
where
$$A_{lm} = \sum_{j=1}^{n}\frac{b_{lj}b_{mj}}{1 - 2itc_j} \quad\text{and}\quad B_l = \sum_{j=1}^{n}\frac{a_j b_{lj}}{1 - 2itc_j}. \tag{2}$$
















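2. Bartlett (1938) considered the case of two variables $u$ and $u_1$ whose joint characteristic function is $\phi(t, t_1)$, and he showed that the characteristic function of $u$ when $u_1$ is fixed, which we denote by $\phi(t \mid u_1)$, is given by
$$\phi(t \mid u_1) = \frac{\int_{-\infty}^{+\infty} e^{-it_1u_1}\,\phi(t, t_1)\,dt_1}{\int_{-\infty}^{+\infty} e^{-it_1u_1}\,\phi(0, t_1)\,dt_1}.$$
This result may be easily, and obviously, extended to the case of $s + 1$ variables, $u, u_1, u_2, \dots, u_s$. Write $p(u, u_1, \dots, u_s)$ for the joint probability distribution of $u, u_1, \dots, u_s$ and $\phi(t, t_1, t_2, \dots, t_s)$ for its characteristic function. It follows that
$$\phi(t, t_1, \dots, t_s) = \int_{-\infty}^{+\infty}\!\!\cdots\!\int_{-\infty}^{+\infty}\exp\Big[itu + i\sum_{l=1}^{s}t_l u_l\Big]\,p(u, u_1, \dots, u_s)\,du\,du_1 \dots du_s$$
$$= \int_{-\infty}^{+\infty}\!\!\cdots\!\int_{-\infty}^{+\infty}\exp\Big[itu + i\sum_{l=1}^{s}t_l u_l\Big]\,p(u \mid u_1, \dots, u_s)\,p(u_1, \dots, u_s)\,du\,du_1 \dots du_s.$$
Writing
$$\phi(t \mid u_1, u_2, \dots, u_s) = \int_{-\infty}^{+\infty} e^{itu}\,p(u \mid u_1, u_2, \dots, u_s)\,du,$$
we have
$$\phi(t, t_1, \dots, t_s) = \int_{-\infty}^{+\infty}\!\!\cdots\!\int_{-\infty}^{+\infty}\phi(t \mid u_1, \dots, u_s)\,p(u_1, \dots, u_s)\exp\Big[i\sum_{l=1}^{s}t_l u_l\Big]\,du_1 \dots du_s,$$
whence, using the Fourier transform,
$$\phi(t \mid u_1, \dots, u_s)\,p(u_1, \dots, u_s) = \frac{1}{(2\pi)^s}\int_{-\infty}^{+\infty}\!\!\cdots\!\int_{-\infty}^{+\infty}\exp\Big[-i\sum_{l=1}^{s}t_l u_l\Big]\,\phi(t, t_1, \dots, t_s)\,dt_1 \dots dt_s.$$
It follows that
$$\phi(t \mid u_1, u_2, \dots, u_s) = \frac{\int_{-\infty}^{+\infty}\!\cdots\!\int_{-\infty}^{+\infty}\exp\big[-i\sum_{l=1}^{s}t_l u_l\big]\,\phi(t, t_1, \dots, t_s)\,dt_1 \dots dt_s}{\int_{-\infty}^{+\infty}\!\cdots\!\int_{-\infty}^{+\infty}\exp\big[-i\sum_{l=1}^{s}t_l u_l\big]\,\phi(0, t_1, \dots, t_s)\,dt_1 \dots dt_s}. \tag{3}$$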


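3. We proceed to apply this result to find the characteristic function of $\chi'^2$ of § 1. Substituting in the right-hand side of (3) the joint characteristic function of $\chi'^2, p_1, p_2, \dots, p_s$ as given by (1), we obtain, after some reduction, that the characteristic function of $\chi'^2$, given $p_1, p_2, \dots, p_s$, is
$$\phi_{\chi'^2}(t \mid p_1, p_2, \dots, p_s) = \prod_{j=1}^{n}(1 - 2itc_j)^{-1/2}\exp\Big[it\sum_{j=1}^{n}\frac{c_j a_j^2}{1 - 2itc_j}\Big]\;\frac{\int_{-\infty}^{+\infty}\!\cdots\!\int_{-\infty}^{+\infty}\exp\big[i\sum_{l=1}^{s}(B_l - p_l)t_l - \tfrac{1}{2}\sum_{l=1}^{s}\sum_{m=1}^{s}A_{lm}t_l t_m\big]\prod_{l=1}^{s}dt_l}{\int_{-\infty}^{+\infty}\!\cdots\!\int_{-\infty}^{+\infty}\exp\big[i\sum_{l=1}^{s}(\beta_l - p_l)t_l - \tfrac{1}{2}\sum_{l=1}^{s}\sum_{m=1}^{s}\alpha_{lm}t_l t_m\big]\prod_{l=1}^{s}dt_l}, \tag{4}$$
where
$$\alpha_{lm} = \sum_{j=1}^{n}b_{lj}b_{mj} \quad\text{and}\quad \beta_l = \sum_{j=1}^{n}a_j b_{lj}, \tag{5}$$
and $A_{lm}$ and $B_l$ are as given in (2). The multiple integral in the numerator of (4) can be evaluated and is
$$\frac{(2\pi)^{s/2}}{|A|^{1/2}}\exp\left[-\tfrac{1}{2}(B - p)'A^{-1}(B - p)\right],$$
where $A$ is the matrix $\{A_{lm}\}$, $l = 1, 2, \dots, s$, $m = 1, 2, \dots, s$, $A^{-1}$ the inverse matrix of $A$, $|A|$ the determinant associated with $A$, $(B - p)$ the column vector $\{B_l - p_l\}$, $l = 1, 2, \dots, s$, and $(B - p)'$ the corresponding row vector. The denominator of (4) may be evaluated in a similar way and we have finally
$$\phi_{\chi'^2}(t \mid p_1, \dots, p_s) = \prod_{j=1}^{n}(1 - 2itc_j)^{-1/2}\exp\Big[it\sum_{j=1}^{n}\frac{c_j a_j^2}{1 - 2itc_j}\Big]\left(\frac{|\alpha|}{|A|}\right)^{1/2}\exp\left[-\tfrac{1}{2}(B - p)'A^{-1}(B - p) + \tfrac{1}{2}(\beta - p)'\alpha^{-1}(\beta - p)\right], \tag{6}$$
where $\alpha$ is the matrix $\{\alpha_{lm}\}$, $l = 1, 2, \dots, s$, and $\beta$ the vector $\{\beta_l\}$, $l = 1, 2, \dots, s$. The probability law of $\chi'^2$ for given restraints can be evaluated in certain cases.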








4. The distribution of the unweighted sum of non-central squares of normal deviates subject to $s$ orthogonal linear restraints can be obtained as a special case of the foregoing. It has already been derived by Patnaik (1949) using a different method. We put $c_j = 1$ for all $j$, when
$$\chi'^2 = \sum_{j=1}^{n}x_j^2,$$
where the $x$'s are as previously defined and
$$E(x_j) = a_j \qquad (j = 1, 2, \dots, n).$$
If the restrictions are orthogonal, i.e. if
$$\sum_{j=1}^{n}b_{lj}b_{mj} = \delta_{lm} = \begin{cases}1 & \text{if } l = m,\\ 0 & \text{if } l \neq m,\end{cases}$$
then from (2) and (5) we see that
$$A_{lm} = (1 - 2it)^{-1}\alpha_{lm} = (1 - 2it)^{-1}\delta_{lm}$$
and
$$B_l = (1 - 2it)^{-1}\beta_l.$$
Substituting in (4), it follows that
$$\phi_{\chi'^2}(t \mid p_1, \dots, p_s) = (1 - 2it)^{-\frac{1}{2}(n - s)}\exp\Big[\frac{it}{1 - 2it}\sum_{j=1}^{n}a_j^2 - \frac{1}{2(1 - 2it)}\sum_{l=1}^{s}\{\beta_l - p_l(1 - 2it)\}^2 + \frac{1}{2}\sum_{l=1}^{s}(\beta_l - p_l)^2\Big]$$
$$= (1 - 2it)^{-\frac{1}{2}(n - s)}\exp\Big[\frac{\lambda it}{1 - 2it} + it\sum_{l=1}^{s}p_l^2\Big], \tag{7}$$
where
$$\lambda = \sum_{j=1}^{n}a_j^2 - \sum_{l=1}^{s}\Big(\sum_{j=1}^{n}a_j b_{lj}\Big)^2. \tag{8}$$
We recall that the characteristic function of the non-central $\chi^2$, referred to as $\chi'^2$, is
$$\phi_{\chi'^2}(t) = (1 - 2it)^{-\frac{1}{2}n}\exp\Big[\frac{\lambda it}{1 - 2it}\Big].$$
Hence, using the uniqueness property of the characteristic function and the fact that there is a (1, 1) correspondence between the characteristic function of a variable and its probability law, it follows that when the $x_j$'s are subject to $s$ orthogonal linear restraints,
$$\sum_{j=1}^{n}x_j^2 - \sum_{l=1}^{s}p_l^2$$
is distributed as $\chi'^2$ with $(n - s)$ degrees of freedom and with parameter $\lambda$ as given by (8).
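For a numerical illustration of the parameter in equation (8), the following sketch may be useful. The function names and the example values are hypothetical; the restraints are assumed to be supplied as orthonormal rows $b_{l1}, \dots, b_{ln}$, which the helper `is_orthonormal` checks.

```python
import math


def is_orthonormal(restraints, tol=1e-12):
    """Check sum_j b_lj * b_mj = 1 if l == m, else 0."""
    for l, row_l in enumerate(restraints):
        for m, row_m in enumerate(restraints):
            dot = sum(u * v for u, v in zip(row_l, row_m))
            if abs(dot - (1.0 if l == m else 0.0)) > tol:
                return False
    return True


def noncentrality(a, restraints):
    """Parameter of equation (8): sum a_j^2 - sum_l (sum_j a_j b_lj)^2."""
    if not is_orthonormal(restraints):
        raise ValueError("restraints must be orthonormal")
    total = sum(v * v for v in a)
    for row in restraints:
        total -= sum(aj * bj for aj, bj in zip(a, row)) ** 2
    return total


# Hypothetical example: n = 3 expectations and s = 1 restraint on the total.
a = [1.0, 2.0, 3.0]
restraints = [[1 / math.sqrt(3)] * 3]
lam = noncentrality(a, restraints)
degrees_of_freedom = len(a) - len(restraints)
print(lam, degrees_of_freedom)
```

Here $\sum a_j^2 = 14$ and $(\sum a_j b_{1j})^2 = 12$, so the parameter is 2 on $n - s = 2$ degrees of freedom.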


REFERENCES 


BARTLETT, M. S. (1938). J. Lond. Math. Soc. 13, 62.
PATNAIK, P. B. (1949). Biometrika, 36, 202.










Intra-class rank correlation 


By J. W. WHITFIELD 
Psychological Laboratory, University of Cambridge 


Given a number of pairs of items each of which is measured according to some quality, the normal
procedure to discover whether the arrangement in pairs bears any relation to the quality measured is to
calculate the intra-class correlation coefficient. As an example we may consider pairs of brothers and
a measure of their stature. If we wish to know whether there is any correlation between older brothers
and younger brothers on stature, we calculate the product-moment correlation in the ordinary way. But
if the more general question whether brothers tend to have similar stature is asked, there are no grounds
upon which we can differentiate between each member of a pair so as to separate them into the two arrays
for correlation. The mean correlation for all possible arrangements is required. This is the intra-class
correlation which can be calculated without performing the numerous individual product-moment
correlations. With quantitative data the procedure is not limited to paired data, but may be performed
with arrangements of varying sized groups.

The same type of problem can arise with ranked data. Consider eight pairs of brothers, ranked upon
some quality which is not amenable to quantitative measure:

                   Rank values
First pair:         1 and  6
Second pair:        5 and  9
Third pair:         7 and  8
Fourth pair:       11 and 13
Fifth pair:         2 and  4
Sixth pair:        15 and 16
Seventh pair:      10 and 14
Eighth pair:        3 and 12

There are 256 possible arrangements for correlation, although only 128 need actually be calculated.
These are distributed (using Kendall's $\tau$ coefficient of rank correlation; Kendall, 1948) as follows:

Frequency     S value
     8          +18
    32          +16
    40          +14
    16          +12
     8          +10
    16          + 8
     8          + 6
   128        +1664   (total frequency; total of frequency × S)

Hence the mean S is +13, and the mean $\tau$ is +0.4643.
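The mean over all arrangements can be confirmed by brute force. The sketch below is illustrative only (function names are assumptions): it assigns the two members of each pair to the two arrays in every possible way, computes Kendall's S for each of the 256 arrangements, and averages.

```python
from fractions import Fraction
from itertools import product

PAIRS = [(1, 6), (5, 9), (7, 8), (11, 13), (2, 4), (15, 16), (10, 14), (3, 12)]


def sign(d):
    return (d > 0) - (d < 0)


def kendall_s(xs, ys):
    """Kendall's S: concordant minus discordant pairs of positions."""
    total = 0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            total += sign(xs[j] - xs[i]) * sign(ys[j] - ys[i])
    return total


def all_scores(pairs):
    """S for every assignment of the members of each pair to the two arrays."""
    scores = []
    for flips in product((False, True), repeat=len(pairs)):
        xs = [b if f else a for (a, b), f in zip(pairs, flips)]
        ys = [a if f else b for (a, b), f in zip(pairs, flips)]
        scores.append(kendall_s(xs, ys))
    return scores


scores = all_scores(PAIRS)
mean_s = Fraction(sum(scores), len(scores))
mean_tau = mean_s / Fraction(8 * 7, 2)   # maximum S for 8 pairs is 28
print(mean_s, float(mean_tau))
```

Each arrangement and its mirror image (the two arrays interchanged) give the same S, which is why only 128 of the 256 need be calculated by hand.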

The computation of S for all possible arrangements would be excessively tedious for all data except
those involving very few observations. The following alternative computation, though restricted to the
case of paired data only, is much simpler. It can probably be extended to include family groupings other
than two.

Rearrange the data in the example in order of the higher rank in each pair, i.e. first the pair with the
highest ranked individual in it, next the pair with the highest ranked individual from the remainder,
etc.:
(1, 6) (2, 4) (3, 12) (5, 9) (7, 8) (10, 14) (11, 13) (15, 16).




Calculate S for this order (making no comparisons within pairs). For example, the first number, 1, has 14 numbers on its right greater than itself, ignoring the number 6 which is bracketed with it, and the contribution to the score is 14. The second number, 6, has 10 greater numbers on its right and the contribution is 10. The third number, 2, has 12 greater numbers, ignoring the 4 which is bracketed with it; and so on. The positive score obtained in this way is

14 + 10 + 12 + 11 + 10 + 4 + 8 + 6 + 6 + 6 + 4 + 2 + 2 + 2 = 97.

The maximum score is 112 and hence the score of the negative contributions is 112 − 97 = 15. The total score is thus 97 − 15 = 82. The maximum value S can attain is $\tfrac{1}{2}n(n - 2)$, where $n$ is the number of individuals. The minimum value of S is zero. This follows from a consideration of the extreme case of first paired with $n$th, second with $(n - 1)$th, and so on, where the S contribution from the higher ranked individuals in the pairs is $\tfrac{1}{2}n(\tfrac{1}{2}n - 1)$ and the S contribution from the lower ranked members is $-\tfrac{1}{2}n(\tfrac{1}{2}n - 1)$. Furthermore, as will be seen from the sampling distribution, the mean value lies midway between zero and $\tfrac{1}{2}n(n - 2)$, and the sampling distribution is symmetrical about the mean. These qualities of mean and range are true only for families of two members each. Taking the mean as origin, we can define
$$S_p = S - \frac{n(n - 2)}{4}$$
(using the suffix $p$ to indicate paired data) and
$$\tau_p = \frac{4S_p}{n(n - 2)}.$$
In the example
$$S_p = 82 - \frac{16 \cdot 14}{4} = +26, \qquad \tau_p = \frac{4 \cdot 26}{16 \cdot 14} = +0.4643,$$
which agrees with the value of mean $\tau$ obtained from considering all possible arrangements for correlation.
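The short computation just described can be sketched as follows. The function name `paired_scores` is an assumption; pairs are put in order of their best-ranked member, every member is compared with every member of each later pair, and no comparisons are made within pairs.

```python
def paired_scores(pairs):
    """Return (S, S_p, tau_p) for ranked data arranged in pairs."""
    n = 2 * len(pairs)
    # Order each pair internally, then order pairs by their best-ranked member.
    ordered = sorted(tuple(sorted(p)) for p in pairs)
    s = 0
    for i, left in enumerate(ordered):
        for right in ordered[i + 1:]:
            for u in left:
                for v in right:
                    s += 1 if v > u else -1   # ranks are distinct
    s_p = s - n * (n - 2) // 4                # n(n - 2)/4 is a whole number for even n
    tau_p = 4 * s_p / (n * (n - 2))
    return s, s_p, tau_p


pairs = [(1, 6), (5, 9), (7, 8), (11, 13), (2, 4), (15, 16), (10, 14), (3, 12)]
s, s_p, tau_p = paired_scores(pairs)
print(s, s_p, round(tau_p, 4))  # 82 26 0.4643
```

The work grows only as the square of the number of pairs, against the doubling with each additional pair that enumeration of all arrangements entails.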


SAMPLING DisTRIBUTION OF S, 


The sampling distribution of S, has been calculated for n = 6, 8, 10, ...,20, and the tables are given at 
the end of this paper. A summation feature makes the calculation easy. The simplest case is that of n = 4, 
for which we have: 
Possible orders S S 
(1, 2) (3, 4) 
(1, 3) (2, 4) 
(1, 4) (2, 3) 


owe 


For n = 6, we have S (or $S_p$) composed of the internal contribution of the second and third pairs,
distributed as for n = 4, together with the contribution of the first pair with respect to the second and third
pairs.

The pair (1, 2) will occur with S values 8+4, 8+2, 8+0;
The pair (1, 3) will occur with S values 6+4, 6+2, 6+0;
The pair (1, 4) will occur with S values 4+4, 4+2, 4+0;
The pair (1, 5) will occur with S values 2+4, 2+2, 2+0;
The pair (1, 6) will occur with S values 0+4, 0+2, 0+0;

or, expressed as a frequency table:

                              First pair
     S     S_p    (1, 2)  (1, 3)  (1, 4)  (1, 5)  (1, 6)    Frequency
    12      6       1       .       .       .       .           1
    10      4       1       1       .       .       .           2
     8      2       1       1       1       .       .           3
     6      0       .       1       1       1       .           3
     4     −2       .       .       1       1       1           3
     2     −4       .       .       .       1       1           2
     0     −6       .       .       .       .       1           1








































The sampling distribution for n is, therefore, the aggregate of (n − 1) sampling distributions for (n − 2),
distributed with means at $S_p$ = +(n−2), (n−4), ..., −(n−4), −(n−2). This feature demonstrates the
symmetry of the distribution, and also makes for a simple calculation of the variance.
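The aggregation property lends itself to a short recursive computation. The following Python sketch (function names hypothetical) builds the frequency table of $S_p$ for n from that for n − 2 by superposing n − 1 shifted copies, on the assumption that the case n = 2 consists of a single arrangement with $S_p$ = 0.

```python
from collections import Counter
from fractions import Fraction


def sp_distribution(n):
    """Frequency table {S_p: count} for n individuals in n/2 pairs (n even)."""
    if n < 2 or n % 2:
        raise ValueError("n must be an even integer >= 2")
    dist = Counter({0: 1})  # n = 2: one arrangement, S_p = 0 (assumed start)
    for m in range(4, n + 2, 2):
        shifted = Counter()
        # (m - 1) copies of the table for m - 2, centred at
        # -(m - 2), -(m - 4), ..., (m - 4), (m - 2)
        for shift in range(-(m - 2), m - 1, 2):
            for value, count in dist.items():
                shifted[value + shift] += count
        dist = shifted
    return dict(dist)


def upper_tail(n, s_p):
    """Exact proportion of arrangements with S_p >= s_p."""
    dist = sp_distribution(n)
    total = sum(dist.values())
    return Fraction(sum(c for v, c in dist.items() if v >= s_p), total)
```

For n = 6 the function reproduces the frequencies 1, 2, 3, 3, 3, 2, 1 of the table above, and `upper_tail(8, 12)` gives 1/105.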

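Using $c_n = \tfrac{1}{2}S_p$, we have the sum of squares of $c_n$ about zero in two parts: (a) (n − 1) times the sum of
squares for the distribution of $c_{n-2}$, together with (b) the contribution due to the displacement of these
subgroups from zero, i.e.

$$\text{Sum of squares } c_n = (n-1)(\text{sum of squares } c_{n-2}) + 2(n-3)(n-5)\cdots(1)\left\{\left(\frac{n-2}{2}\right)^2 + \left(\frac{n-4}{2}\right)^2 + \cdots + 1^2\right\}. \tag{1}$$

Expanding (1) we get

$$
\begin{aligned}
\text{Sum of squares } c_n = 2\Big[\,&(n-3)(n-5)\cdots(1)\Big\{\big(\tfrac{n-2}{2}\big)^2 + \big(\tfrac{n-4}{2}\big)^2 + \cdots + 1^2\Big\} \\
+\,&(n-1)(n-5)(n-7)\cdots(1)\Big\{\big(\tfrac{n-4}{2}\big)^2 + \big(\tfrac{n-6}{2}\big)^2 + \cdots + 1^2\Big\} \\
+\,&(n-1)(n-3)(n-7)(n-9)\cdots(1)\Big\{\big(\tfrac{n-6}{2}\big)^2 + \big(\tfrac{n-8}{2}\big)^2 + \cdots + 1^2\Big\} + \cdots\Big].
\end{aligned}
$$

Dividing by {(n − 1)(n − 3)(n − 5) ... 1} to obtain the variance,

$$\operatorname{var} c_n = 2\left[\frac{\left(\frac{n-2}{2}\right)^2 + \left(\frac{n-4}{2}\right)^2 + \cdots + 1^2}{n-1} + \frac{\left(\frac{n-4}{2}\right)^2 + \left(\frac{n-6}{2}\right)^2 + \cdots + 1^2}{n-3} + \frac{\left(\frac{n-6}{2}\right)^2 + \left(\frac{n-8}{2}\right)^2 + \cdots + 1^2}{n-5} + \cdots\right]. \tag{2}$$

Substituting $l = \tfrac{1}{2}n$ in (2) we find

$$
\begin{aligned}
\operatorname{var} c_n &= 2\left[\frac{(l-1)^2 + (l-2)^2 + \cdots + 1^2}{2l-1} + \frac{(l-2)^2 + (l-3)^2 + \cdots + 1^2}{2l-3} + \frac{(l-3)^2 + (l-4)^2 + \cdots + 1^2}{2l-5} + \cdots\right] \\
&= \frac{2}{6}\left[\frac{l(l-1)(2l-1)}{2l-1} + \frac{(l-1)(l-2)(2l-3)}{2l-3} + \frac{(l-2)(l-3)(2l-5)}{2l-5} + \cdots\right] \\
&= \tfrac{1}{3}\{l(l-1) + (l-1)(l-2) + (l-2)(l-3) + \cdots\} \\
&= \tfrac{1}{9}\,l(l+1)(l-1).
\end{aligned}
$$

Hence

$$\operatorname{var} c_n = \frac{n^3 - 4n}{72} \quad\text{or}\quad \frac{n(n-2)(n+2)}{72},$$

$$\operatorname{var} S_p = \frac{n^3 - 4n}{18} \quad\text{or}\quad \frac{n(n-2)(n+2)}{18}.$$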
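As a check on the algebra, the sum-of-squares recurrence can be run exactly and compared with the closed form n(n − 2)(n + 2)/18. The sketch below is illustrative only; the function names are hypothetical, and the recurrence is taken in the form: sum of squares of $c_n$ = (n − 1)(sum of squares of $c_{n-2}$) + 2(n − 3)(n − 5)...(1){1² + 2² + ... + (½n − 1)²}.

```python
from fractions import Fraction


def variance_of_sp(n):
    """Variance of S_p from the sum-of-squares recurrence for c_n = S_p / 2."""
    if n < 2 or n % 2:
        raise ValueError("n must be an even integer >= 2")
    sum_sq = 0  # sum of squares of c_2: a single arrangement with c = 0
    count = 1   # number of arrangements for m - 2: (m - 3)(m - 5)...1
    for m in range(4, n + 2, 2):
        # twice the sum 1^2 + 2^2 + ... + (m/2 - 1)^2: squared displacements
        displacement = 2 * sum(j * j for j in range(1, m // 2))
        sum_sq = (m - 1) * sum_sq + count * displacement
        count *= m - 1
    return 4 * Fraction(sum_sq, count)  # var S_p = 4 var c_n


def closed_form(n):
    """n(n - 2)(n + 2) / 18."""
    return Fraction(n * (n - 2) * (n + 2), 18)
```

For n = 6 both functions give 32/3, in agreement with the frequency table for that case.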


It is necessary to consider whether the distribution approaches normality so that a normal deviate
test can be applied to an obtained value of $S_p$. As the distribution is symmetrical, the third moment and
$\beta_1$ are zero. Fourth moments have been computed for the distributions n = 4 to n = 20, and $\beta_2$ coefficients
calculated. At low values of n the distribution is distinctly platykurtic, but the $\beta_2$ value approaches 3 as
n increases. The values of $\beta_2$ are given at the foot of the tables at the end of this note.

The fourth moment of $S_p$ can be calculated in a manner similar to that used above to derive the variance.
The algebra can, however, be avoided if we note that $\mu_4$ must be a sextic in n and fit such a sextic to the
first seven values of the fourth moment, which are as follows:


     n        μ₄
     4         10.6667
     6        247.4667
     8       1719.4667
    10       7244.8000
    12      22892.8000
    14      59852.8000
    16     136729.6000








Probability that a given value of $S_p$ will be attained or exceeded by chance
(single tail only, distribution symmetrical about zero; an entry such as 0.0⁵4 stands for 0.000004)

    S_p    n=6      n=8      n=10     n=12     n=14     n=16     n=18     n=20
     0   0.50000  0.50000  0.50000  0.50000  0.50000  0.50000  0.50000  0.50000
     2   0.40000  0.42857  0.44868  0.46080  0.46875  0.47432  0.47842  0.48153
     4   0.20000  0.29524  0.34921  0.38374  0.40693  0.42336  0.43549  0.44473
     6   0.06667  0.18095  0.25820  0.31063  0.34717  0.37356  0.39326  0.40838
     8      -     0.09524  0.17989  0.24367  0.29069  0.32564  0.35217  0.37276
    10      -     0.03810  0.11640  0.18461  0.22855  0.28025  0.31264  0.33813
    12      -     0.00952  0.06878  0.13499  0.19156  0.23794  0.27502  0.30475
    14      -        -     0.03598  0.09370  0.15023  0.19913  0.23964  0.27283
    16      -        -     0.01587  0.06195  0.11483  0.16412  0.20673  0.24257
    18      -        -     0.00529  0.03848  0.08532  0.13309  0.17649  0.21412
    20      -        -     0.00106  0.02213  0.06143  0.10606  0.14903  0.18760
    22      -        -        -     0.01154  0.04268  0.08296  0.12440  0.16309
    24      -        -        -     0.00529  0.02843  0.06359  0.10258  0.14065
    26      -        -        -     0.00202  0.01814  0.04769  0.08352  0.12028
    28      -        -        -     0.00058  0.01093  0.03492  0.06708  0.10196
    30      -        -        -     0.00010  0.00616  0.02490  0.05310  0.08565
    32      -        -        -        -     0.00320  0.01725  0.04140  0.07127
    34      -        -        -        -     0.00150  0.01156  0.03175  0.05871
    36      -        -        -        -     0.00061  0.00747  0.02392  0.04786
    38      -        -        -        -     0.00021  0.00462  0.01768  0.03859
    40      -        -        -        -     0.00005  0.00272  0.01280  0.03076
    42      -        -        -        -     0.00001  0.00151  0.00906  0.02421
    44      -        -        -        -        -     0.00078  0.00626  0.01882
    46      -        -        -        -        -     0.00037  0.00420  0.01442
    48      -        -        -        -        -     0.00016  0.00274  0.01089
    50      -        -        -        -        -     0.00006  0.00172  0.00810
    52      -        -        -        -        -     0.00002  0.00104  0.00592
    54      -        -        -        -        -     0.0⁵4    0.00060  0.00425
    56      -        -        -        -        -     0.0⁶5    0.00033  0.00299
    58      -        -        -        -        -        -     0.00017  0.00206
    60      -        -        -        -        -        -     0.00008  0.00138
    62      -        -        -        -        -        -     0.00004  0.00091
    64      -        -        -        -        -        -     0.00001  0.00058
    66      -        -        -        -        -        -     0.0⁵5    0.00035
    68      -        -        -        -        -        -     0.0⁵1    0.00021
    70      -        -        -        -        -        -     0.0⁶3    0.00012
    72      -        -        -        -        -        -     0.0⁷3    0.00007
    74      -        -        -        -        -        -        -     0.00003
    76      -        -        -        -        -        -        -     0.00002
    78      -        -        -        -        -        -        -     0.00001
    80      -        -        -        -        -        -        -     0.0⁵3
    82      -        -        -        -        -        -        -     0.0⁵1
    84      -        -        -        -        -        -        -     0.0⁶3
    86      -        -        -        -        -        -        -     0.0⁷8
    88      -        -        -        -        -        -        -     0.0⁷2
    90      -        -        -        -        -        -        -     0.0⁸2
    β₂     2.17     2.42     2.55     2.63     2.68     2.72     2.76     2.80








































This gives

$$\mu_4 = \frac{n^6}{108} - \frac{n^5}{75} - \frac{2n^4}{27} + \frac{2n^3}{45} + \frac{4n^2}{27} + \frac{8n}{225},$$

and hence

$$\beta_2 \simeq 3 - \frac{108}{25n}.$$

Thus $\beta_2$ will approach the value 3 as n increases, and hence for reasonable values of n the normal deviate
test can be employed.
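The fitted sextic is easily checked against the seven tabulated fourth moments. The Python sketch below is illustrative (function names hypothetical). It assumes the sextic $\mu_4 = n^6/108 - n^5/75 - 2n^4/27 + 2n^3/45 + 4n^2/27 + 8n/225$ and the variance n(n − 2)(n + 2)/18, and it takes the large-n approximation 3 − 108/(25n) from the two leading terms of that sextic.

```python
from fractions import Fraction


def fourth_moment(n):
    """Sextic in n reproducing the seven tabulated fourth moments of S_p."""
    n = Fraction(n)
    return (n**6 / 108 - n**5 / 75 - 2 * n**4 / 27
            + 2 * n**3 / 45 + 4 * n**2 / 27 + 8 * n / 225)


def beta2(n):
    """beta_2 = mu_4 / (variance)^2, with variance n(n-2)(n+2)/18."""
    variance = Fraction(n * (n - 2) * (n + 2), 18)
    return fourth_moment(n) / variance**2


def beta2_leading(n):
    """Large-n approximation from the two leading terms of the sextic."""
    return 3 - Fraction(108, 25 * n)
```

For example, `float(fourth_moment(10))` gives 7244.8, and `float(beta2(10))` is close to the value 2.55 printed at the foot of the table.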

USES OF METHOD 


It seems to the writer that there are several situations in experimental psychology where this method 
will be of use, in addition to the type of situation used in the example. One such possibility concerns the 
problem of relating preference judgements to particular qualities of the objects judged. As an example, 
in preference judgements about different types of chair, height of seat from the ground may be an im- 
portant factor, but the optimum may vary from individual to individual. If various types of chairs were 
chosen so that they could be considered as a number of pairs with regard to this quality (a kind of ranking 
replication), the preference judgements could be treated in this fashion independently for each judge, 
to discover whether or not seat height is an important criterion in the preferences. 


REFERENCE 
KENDALL, M. G. (1948). Rank Correlation Methods. London: Griffin and Co. 


A note on non-normal correlation
By J. B. S. HALDANE


The product-moment correlation ρ is frequently estimated for two variates which are not normally
distributed. There are, however, no general expressions for the effect of this non-normal distribution on
the precision of the estimate of ρ. They may be obtained in one special case which is of biological
importance. Suppose X and Y are two correlated variates. Then if

$$X = a + \frac{\sigma}{\sqrt{2}}\left[(1+\rho)^{1/2}x + (1-\rho)^{1/2}y\right], \qquad Y = b + \frac{\tau}{\sqrt{2}}\left[(1+\rho)^{1/2}x - (1-\rho)^{1/2}y\right],$$

and further if $\bar{x} = \bar{y} = 0$, $\overline{x^2} = \overline{y^2} = 1$, and x and y are independent, the variance of X is σ², that of Y is τ²,
and their covariance is ρστ, regardless of the distributions of x and y. Hence the correlation of X and Y
is ρ. If x and y are normally distributed the correlation is of course normal. Now in biological statistics
X and Y may be measurements of two organs in the same individual, or of their logarithms. x depends
on the sum of causes which affect X and Y alike, y on the sum of causes which affect them oppositely.
For example, in any series of specimens, not all of which are fully grown, x will increase with age up to
a certain point; and in a population containing a minority of juvenile members the distribution of x will
probably be negatively skew. But y may be quite independent of age if the variability of the organs
measured is uncorrelated with age, and may well be normally distributed when x is not.
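That the construction yields correlation ρ whatever the distributions of x and y may be verified exactly for small discrete distributions. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and x and y are taken as independent discrete variates, standardised to zero mean and unit variance before use.

```python
import math
from itertools import product


def standardise(values, probs):
    """Shift and scale a discrete distribution to mean 0 and variance 1."""
    mean = sum(v * p for v, p in zip(values, probs))
    var = sum((v - mean) ** 2 * p for v, p in zip(values, probs))
    return [(v - mean) / math.sqrt(var) for v in values]


def correlation_of_construction(rho, sigma, tau, x_dist, y_dist, a=0.0, b=0.0):
    """Exact correlation of X and Y built from independent x and y."""
    (xs, px), (ys, py) = x_dist, y_dist
    xs, ys = standardise(xs, px), standardise(ys, py)
    up, um = math.sqrt(1 + rho), math.sqrt(1 - rho)
    ex = ey = exx = eyy = exy = 0.0
    for (x, p), (y, q) in product(zip(xs, px), zip(ys, py)):
        big_x = a + sigma / math.sqrt(2) * (up * x + um * y)
        big_y = b + tau / math.sqrt(2) * (up * x - um * y)
        w = p * q  # independence: the joint weight is a product
        ex += w * big_x
        ey += w * big_y
        exx += w * big_x ** 2
        eyy += w * big_y ** 2
        exy += w * big_x * big_y
    cov = exy - ex * ey
    return cov / math.sqrt((exx - ex ** 2) * (eyy - ey ** 2))
```

A markedly skew x, such as values (0, 1, 5) with probabilities (0.5, 0.3, 0.2), combined with a symmetrical y, still returns the chosen ρ to within rounding error.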

Let $\kappa_{rs}$ be the cumulants of the joint distribution of x and y. Then since they are independent, $\kappa_{rs} = 0$
unless r or s = 0; $\kappa_{10} = \kappa_{01} = 0$, $\kappa_{20} = \kappa_{02} = 1$; and let $\kappa_{30} = \gamma_1$, $\kappa_{03} = \gamma_1'$, $\kappa_{40} = \gamma_2$, $\kappa_{04} = \gamma_2'$, etc., these
being measures of the deviations from normality of the distributions of x and y.

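Our estimate of ρ on a sample of n members is thus

$$
\begin{aligned}
r &= \frac{n\Sigma X_iY_i - \Sigma X_i\,\Sigma Y_i}{\{n\Sigma X_i^2 - (\Sigma X_i)^2\}^{1/2}\,\{n\Sigma Y_i^2 - (\Sigma Y_i)^2\}^{1/2}} \\
&= [n(1+\rho)\Sigma x_i^2 - n(1-\rho)\Sigma y_i^2 - (1+\rho)(\Sigma x_i)^2 + (1-\rho)(\Sigma y_i)^2] \\
&\quad\times [n(1+\rho)\Sigma x_i^2 + n(1-\rho)\Sigma y_i^2 + 2n(1-\rho^2)^{1/2}\Sigma x_iy_i - (1+\rho)(\Sigma x_i)^2 - (1-\rho)(\Sigma y_i)^2 - 2(1-\rho^2)^{1/2}\Sigma x_i\,\Sigma y_i]^{-1/2} \\
&\quad\times [n(1+\rho)\Sigma x_i^2 + n(1-\rho)\Sigma y_i^2 - 2n(1-\rho^2)^{1/2}\Sigma x_iy_i - (1+\rho)(\Sigma x_i)^2 - (1-\rho)(\Sigma y_i)^2 + 2(1-\rho^2)^{1/2}\Sigma x_i\,\Sigma y_i]^{-1/2}.
\end{aligned}
$$

So

$$
\begin{aligned}
r^2 &= \frac{[(1+\rho)\{n\Sigma x_i^2 - (\Sigma x_i)^2\} - (1-\rho)\{n\Sigma y_i^2 - (\Sigma y_i)^2\}]^2}{[(1+\rho)\{n\Sigma x_i^2 - (\Sigma x_i)^2\} + (1-\rho)\{n\Sigma y_i^2 - (\Sigma y_i)^2\}]^2 - 4(1-\rho^2)(n\Sigma x_iy_i - \Sigma x_i\,\Sigma y_i)^2} \\
&= \frac{[(1+\rho)k_{20} - (1-\rho)k_{02}]^2}{[(1+\rho)k_{20} + (1-\rho)k_{02}]^2 - 4(1-\rho^2)k_{11}^2},
\end{aligned}
\tag{1}
$$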

















where $k_{rs}$ is the unbiased estimate of $\kappa_{rs}$ from the moments of the variates in the sample. For example,
$k_{20} = \dfrac{n\Sigma x_i^2 - (\Sigma x_i)^2}{n(n-1)}$. We can now ask how the mean value of r² will be affected by deviations of the
distributions of x and y from normality. The mean value of $k_{20}^2$ exceeds $(\kappa_{20})^2$, or unity, by the sampling variance of $k_{20}$, which is

$$\frac{2\kappa_{20}^2}{n-1} + \frac{\kappa_{40}}{n} = \frac{2}{n-1} + \frac{\gamma_2}{n}.$$

The effect of non-normality in the distribution of x is therefore to increase the
mean value of $k_{20}^2$ by $\gamma_2/n$. Similarly, $k_{02}^2$ is increased by $\gamma_2'/n$. $k_{20}k_{02}$ is not increased, since x and y are
independent. $k_{11}^2$ does not include terms with zero suffixes, so it is also unaltered. In fact, both numerator
and denominator of (1) are increased by $n^{-1}[(1+\rho)^2\gamma_2 + (1-\rho)^2\gamma_2']$.


We cannot calculate the variance of r directly from (1) since r differs from ρ by a quantity of order $n^{-1/2}$.
But since both the numerator and denominator are increased by

$$n^{-1}[(1+\rho)^2\gamma_2 + (1-\rho)^2\gamma_2']$$

above the values found when the distributions of x and y are normal, we have in the normal case

$$\overline{r_0^2} = \frac{4\rho^2 + 4n^{-1}P}{4 + 4n^{-1}Q},$$

where P and Q are independent of n to order $n^{-2}$, and in general

$$\overline{r^2} = \frac{4\rho^2 + n^{-1}[4P + (1+\rho)^2\gamma_2 + (1-\rho)^2\gamma_2']}{4 + n^{-1}[4Q + (1+\rho)^2\gamma_2 + (1-\rho)^2\gamma_2']}.$$

So

$$\overline{r^2} - \overline{r_0^2} = \frac{1-\rho^2}{4n}[(1+\rho)^2\gamma_2 + (1-\rho)^2\gamma_2'] + O(n^{-2}).$$
The variance of r is therefore increased by this quantity. The precision of the estimate of ρ does not
therefore depend on the skewness of the distributions of x and y, provided they are mesokurtic. And since

$$\gamma_1 = \sqrt{2}\,(1+\rho)^{-3/2}\{\gamma_1(X) + \gamma_1(Y)\}, \qquad \gamma_1' = \sqrt{2}\,(1-\rho)^{-3/2}\{\gamma_1(X) - \gamma_1(Y)\},$$

it follows that skew distribution of X and Y will not affect the precision of r. On the other hand, the
distributions of X and Y have the same value of $\gamma_2$ or $\beta_2 - 3$, namely,

$$\Gamma_2 = \tfrac{1}{4}[(1+\rho)^2\gamma_2 + (1-\rho)^2\gamma_2'].$$





Hence

$$\operatorname{var}(r) = \frac{1-\rho^2}{n}\left(1 - \rho^2 + \Gamma_2\right) + O(n^{-2}). \tag{2}$$

If we employ Fisher’s transformation $z = \tfrac{1}{2}\log\left(\dfrac{1+r}{1-r}\right)$, we find

$$\operatorname{var}(z) = n^{-1}\left\{1 + \frac{\Gamma_2}{1-\rho^2}\right\} + O(n^{-2}). \tag{3}$$

The variance of z is thus no longer almost independent of ρ. But the precision of r is increased if the
distributions of X and Y are platykurtic, and decreased if they are leptokurtic. Clearly, however, (2) and
(3) are inapplicable when |ρ| is near unity, terms of at least order $n^{-2}$ being required.
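Formulae (2) and (3) are simple enough to tabulate directly. The following Python sketch is illustrative only (function names hypothetical). It takes (2) as var(r) = (1 − ρ²)(1 − ρ² + Γ₂)/n and (3) as var(z) = {1 + Γ₂/(1 − ρ²)}/n, with Γ₂ = ¼[(1 + ρ)²γ₂ + (1 − ρ)²γ₂′], and it neglects terms of order $n^{-2}$.

```python
def gamma2_common(rho, g2_x, g2_y):
    """Gamma_2 = ((1+rho)^2 g2_x + (1-rho)^2 g2_y) / 4, shared by X and Y."""
    return ((1 + rho) ** 2 * g2_x + (1 - rho) ** 2 * g2_y) / 4


def var_r(rho, n, big_gamma2):
    """First-order variance of r, formula (2); O(n^-2) terms are neglected."""
    return (1 - rho ** 2) * (1 - rho ** 2 + big_gamma2) / n


def var_z(rho, n, big_gamma2):
    """First-order variance of Fisher's z, formula (3)."""
    return (1 + big_gamma2 / (1 - rho ** 2)) / n
```

With Γ₂ = 0 these reduce to the usual normal-theory values (1 − ρ²)²/n and 1/n; a negative Γ₂ (platykurtic) lowers both variances and a positive Γ₂ (leptokurtic) raises them, more sharply as |ρ| grows.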

On empirical grounds, E. S. Pearson (1931, 1932) stated that ‘the normal bivariate surface may be
distorted and mutilated to a remarkable degree without affecting the frequency distribution of r’. This
would seem to be true when |ρ| is not near unity. It is also true for ‘mutilations’ which affect skewness
without doing a great deal to kurtosis. However, when correlation is high it would seem that a relatively
slight change in kurtosis may have a large effect on the variance of r.

It is possible that the formula (1) might serve as a basis for a new development of the theory of the
distribution of r in the normal case, and further information could certainly be obtained from it
concerning the more general case here considered. In the most general case x and y, though they have a zero
coefficient of correlation, are not independent, so such cumulants as $\kappa_{rs}$ would not in general be zero,
and it is doubtful whether the method would be of value. On the other hand, if the distributions of X
and Y, though having different values of $\beta_1$, have insignificantly different values of $\beta_2$, equations (2)
or (3) may be used with some confidence.

I have to thank Mr K. A. Kermack for useful criticism. 


REFERENCE 


PEARSON, E. S. (1931, 1932). The test of significance for the correlation coefficient. J. Amer. Statist.
Soc. 26, 128; 28, 424.













REVIEWS 


Probability Theory for Statistical Methods. By F. N. DAVID. ix+230 pp. Cambridge 
University Press. 1949. Price 15s. 


This book deals with the simpler parts of probability theory, in so far as they are relevant to common 
statistical techniques. It is intended primarily for students who have reached intermediate standard in 
mathematics, but it will be found useful by all who are interested in statistical methods, as it gathers 
together a number of useful results which otherwise are very much scattered throughout the literature. 

The book begins with a discussion of the definition of probability; after considering various viewpoints, 
the author decides on a definition which is adequate for the purposes of the book (except that strictly 
speaking in the form given it applies only to rational probabilities, whereas when the normal distribution 
is being considered we shall need irrational ones too). After this the author goes on to consider the binomial 
distribution. Three ways of summing the tails of a binomial are discussed: the exact method, using the 
incomplete beta function; a not very well known but very convenient continued fraction due to Markoff 
and Miiller; and also the usual normal approximation. 

Various generalizations of the binomial distribution are then discussed: Poisson’s limit, the negative 
binomial, Neyman’s contagious distribution, and, in later chapters, the multinomial distribution, and 
Poisson’s and Lexis’s forms of the binomial for heterogeneous data. Methods are given for fitting most of 
these distributions to observed data. 

A chapter is devoted to Bayes’s Theorem, pointing out the conditions under which it may be validly 
applied, and also to confidence intervals, and their uses. There follows a chapter on simple genetical 
applications of the theory, largely replacing the usual sequence of problems on drawing black and white
balls from urns, throwing dice, etc. This seems a very welcome development, provided that it is accom- 
panied by a sympathetic understanding of at least the elements of genetics, and genes are not going to be 
regarded in future simply as objects conveniently provided by nature for the construction of new types of 
examination questions. Incidentally, Dr David is over-cautious in her discussion of random mating, 
saying that it is ‘rarely met with in practice’, whereas in fact with genes such as the blood-group genes 
it can be shown to hold with considerable accuracy. 

Three chapters deal with random variables, their distributions, ‘expected’ values, and moments, and 
with the law of large numbers. The simpler moments of the sample mean and sample variance are worked 
out in some detail, both for infinite and (in the case of the mean) for finite populations. 

Two chapters on estimation follow. Here the estimates are practically restricted to linear unbiased 
estimates; and the Generalized Markoff Theorem provides a method of obtaining these estimates— 
a proof of the theorem is given for the case of two parameters. One important application of this theorem 
is to linear regression, and another to the problem of efficient sampling of human and other populations. 
Here the methods of stratified and restricted stratified sampling are explained and discussed in some detail. 

The final three chapters are concerned with characteristic functions and their elementary properties. 
Here we are concerned with finding the characteristic function of a known distribution, and with the 
converse, with the connexion between characteristic functions, moments, and cumulants, and with the
central limit theorem. 

The style of the book is simple, straightforward, and lucid—refreshingly so; and there are few errors, 
mostly trivial ones. The only two places where the exposition is not quite up to the standard of the rest 
of the book seem to be the ‘problem of points’, p. 26, where the condition r<# might be brought in earlier, 
and the calculation on pp. 96-7, where the sentence ‘we may assume therefore that A, = Rr’ gives 
rise to what seem at first sight rather doubtful calculations of the probabilities of various genotypes 
for A; if, however, we say instead ‘let A, be the parent who passed on the r gene to B,’ then these 
calculations are completely justified. Incidentally the phrase ‘for example, colour-blindness’ which occurs 
in the examination question quoted here seems to be singularly unfortunate, as colour-blindness is not 
inherited in this manner, nor has it a gene frequency as low as 0-001. Other omissions which may cause 
some difficulty occur on pp. 15-16, where the product law is used before it has been proved; in chapter XV,
where the fact that two different distributions cannot have the same characteristic function is frequently 
used but nowhere stated; and on p. 214, where the independence of the grouping error is tacitly assumed. 
It is a pity that means and moments are not defined on p. 31, and their simpler relations proved, since 
apart from this omission the book is self-contained. Small errors occur on p. 55, line 6, where the numerator
should be | k − np | − ½; p. 59, line 8, where the inequality should read ... < (1 − f)²; p. 63, second formula,


read (2x +t) for (2n+#) in square brackets; p. 96, line 9, for ‘their offspring’ read ‘a single one of their 
offspring’ (this was expressed badly in the original question); p. 107, line 9, for 105 read 10⁵; p. 126,
line 10, for (n/n − 1)^½ read [n/(n − 1)]^½; p. 128, line 4 from foot, for ‘central difference’ read ‘linear
difference’; p. 147, Markoff’s Lemma should read ‘... < 1/t²’, and similarly on p. 148, the Bienaimé-Tchebycheff
inequality ‘... > 1 − 1/t²’; p. 165, section (iii), ‘the n equations for ℰ(x_i)’. Finally, on p. 205, the definition
of boundedness does not seem to be completely happily phrased. 

There are also a few points on which the reviewer would like to express a personal opinion, although 
perhaps not everyone would agree. The use of (q + p)ⁿ instead of (say) (q + tp)ⁿ as a generating function
(as on p. 30) does not seem very happy, since obviously (q + p)ⁿ = 1. The consistent use of ‘standard
error’ for ‘standard deviation’ seems also a little odd. The usual practice for geneticists now is to con- 
centrate attention on gene frequencies, rather than the genotype frequencies considered in chapter VIII; 
although this would probably simplify the calculations, the point is not very important in the context. 
The French spelling ‘Tchebycheff’, though in common use, is irritating, but can perhaps be blamed on 
the man himself, although he was really ‘Chebyshéf’. And finally, the notation Γ(x + 1) for x!, often used
when x is not integral, seems merely designed to make the formulae unnecessarily complicated.

Summing up, we may say that the book would be a most useful addition to the library of any 
statistician or student of statistics. 


CEDRIC A. B. SMITH 


The Fundamentals of Statistics. By T. L. KELLEY. 755 pp. Harvard University Press, and
Oxford University Press. 1947. Price 55s.


This book is an introductory text to statistical theory. The author aims at setting out the logical processes 
which are involved in the application of any piece of statistical technique and generally succeeds with
admirable clarity. Many who are interested in what statistical methods are about but do not care for 
mathematics may read this text with profit, for the mathematical theory is kept to a minimum in the 
first 200 pages. It is perhaps open to question whether Prof. Kelley achieves the clarity with his mathe- 
matical analysis that he does with his exposition of general principles. The reviewer found his notation 
cumbersome in places, and it is likely that a student coming fresh to the subject will very often find 
himself in difficulties. For example, the section on sequential analysis gives the outline only of the 
mathematical analysis. Reference is given to other publications whereby the student may supplement
this analysis, but for the elementary student these other publications will prove too difficult. 

On the whole it would be fair to say that this book is a useful addition to statistical text-books, although 
it will never take the place of such old and tried classics as Yule and Kendall’s Elementary Theory of 
Statistics, for the student beginning the study of statistical methods. Biologists, medicals, etc., will 
probably find it useful to read before attempting R. A. Fisher’s Statistical Methods for Research Workers. 


F. N. DAVID 





































(All Rights reserved) 


BIOMETRIKA. Vol. XXXVI, Parts III and IV
CONTENTS

The estimation of the parameters of tolerance distributions. By D. J. FINNEY
An overlap problem arising in particle counting. By P. ARMITAGE
Tables of autoregressive series. By M. G. KENDALL
Tables for use in comparisons whose accuracy involves two variances, separately estimated. By ALICE A. ASPIN. With an Appendix by B. L. WELCH  290-296
Bivariate distributions based on simple translation systems. By N. L. JOHNSON  297-304
A test for randomness in a sequence of two alternatives involving a 2 x 2 table. By P. G. MOORE  305-316
A general distribution theory for a class of likelihood criteria. By G. E. P. BOX  317-346
Note on approximations to the power function of the ‘2 x 2 comparative trial’. By G. P. SILLITTO  347-352
The distribution of ‘Student’s’ t in random samples of any size drawn from non-normal universes. By A. K. GAYEN  353-369
The combination of probabilities arising from data in discrete distributions. By H. O. LANCASTER  370-382
Note on the application of Fisher’s k-statistics. By F. N. DAVID  383-393
The moments of the z and F distributions. By F. N. DAVID  394-403
The method of frequency-moments and its application to type VII populations. By HERBERT S. SICHEL  404-425
On the use of Student’s t-test in an asymmetrical population. By S. G. GHURYE  426-430
Tables of symmetric functions. Part I. By F. N. DAVID and M. G. KENDALL  431-449

MISCELLANEA

On the efficiency of the method of moments and Neyman’s type A distribution. By L. R. SHENTON  450-454
Large-sample theory of sequential estimation. By F. J. ANSCOMBE  455-458
A historical note on the method of least squares. By R. L. PLACKETT  458-460
The characteristic function of a weighted sum of non-central squares of normal variates subject to s linear restraints. By G. I. BATEMAN  460-462
Intra-class rank correlation. By J. W. WHITFIELD  463-467
A note on non-normal correlation. By J. B. S. HALDANE  467-468

REVIEWS

F. N. DAVID’s ‘Probability Theory for Statistical Methods’  469-470
T. L. KELLEY’s ‘The Fundamentals of Statistics’  470




A volume of Biometrika containing about 400 pages, with plates and tables, will be published annually in two
half-yearly issues.

Papers for publication should either be sent to
PROFESSOR E. S. PEARSON, Department of Statistics, University College, London, W.C. 1,
or if more convenient to
DR JOHN WISHART, Statistical Laboratory, St Andrew’s Hill, Cambridge.
PROFESSOR M. G. KENDALL, Cudham Court, Cudham, near Knockholt, Kent.

It is a condition of publication in Biometrika that the paper shall not already have been issued elsewhere, and will not be
reprinted without leave of the Editors.

Contributors receive 25 copies of their papers free. Joint authors 15 copies each.

The subscription price, payable in advance, is Inland 45s. net per volume and Abroad 54s. net (including packing and
postage). Recent volumes may still be obtained at the wrapper price; this is 64s. inland, including postage. Cheques must
be made payable to Biometrika, crossed ‘a/c Biometrika Trust’ and sent to The Secretary, Biometrika Office, Department of
Statistics, University College, London, W.C. 1, to whom all orders for series, single copies and offprints should be addressed.
All foreign cheques must be drawn in sterling and on a Bank having a London Agency.

First Printed in Great Britain at the University Press, Cambridge
Reprinted by offset-litho by Percy Lund Humphries & Co., Ltd., Bradford










